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Preface 


Scientists and engineers love simplicity. We prefer simple laws of nature like 
Newton’s laws of motion to a complicated law with tens of equations, simple word 
and phrases to explain the nature of life to books with hundreds of pages, and a 
simple room just with a vessel with flowers to a colorful room full of furniture. 
Simple is best. 

But simple is not easy. It is rather very difficult. Simplification is probably the 
most difficult problem of design. In this book, we will see how to solve this type 
of problems in engineering. The idea is to introduce the sparsity. It makes the hard 
problem solvable. 

The method of sparsity becomes more and more popular in engineering, in 
particular, in signal processing, machine learning, statistics, etc. It is known as 
compressed sensing, compressive sampling, sparse representation, or sparse mod- 
eling. More recently, this method has been applied to systems and control to 
design resource-aware control systems. This book gives a comprehensive guide 
to sparsity methods for systems and control, from standard sparsity methods in 
finite-dimensional vector spaces (Part I) to optimal control methods in infinite- 
dimensional function spaces (Part II). 

The primary objective of this book is to give How to use sparsity methods for 
several engineering problems. I omitted theoretical aspects of the sparsity meth- 
ods, such as characterization of sparse solutions based on random matrix theory 
and convergence of optimization algorithms. If you are interested in those theoret- 
ical aspects, please refer to the references listed in the section titled “Further read- 
ing” at the end of each chapter. Instead, I provide MATLAB programs by which 
you can try sparsity methods by yourself. You will obtain deep understanding of 
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sparsity methods by running these MATLAB programs. MATLAB programs and 
other information on this book are found at 


https://nagahara-masaaki.github.io/som_en 


The book is an English translation from the author’s book Sparse Modeling pub- 
lished in 2017 by Corona Publishing, Japan. Also, the contents of this book have 
been updated and added based on two university courses: Sparsity Methods for Sys- 
tems and Control (SC637) taught at Indian Institute of Technology (IIT) Bombay 
in 2018 during my sabbatical, and Sparse Modeling (M153F31) at The University 
of Kitakyushu, Japan during 2018-2020. These courses are for graduate students, 
but I believe undergraduate students can read with basic knowledge of linear algebra 
and elementary calculus. Also, this book (especially Part II) appeals to professional 
researchers and engineers who are interested in applying sparsity methods to systems 
and control. The courses start from finite-dimensional optimization (i.e. Part I) to 
optimal control (Part II), but if you want to quickly know about the optimal con- 
trol, you can omit Part I and directly start from Chapter 7 (Part II). Chapter 1 is an 
introductory chapter, where I mainly mention the history of sparsity methods in 
engineering. You can omit (or read later) Chapter | since this chapter is completely 
independent of the other chapters. 
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Notation 


A finite-dimensional vector is represented in a bold face, e.g. æ, when the size of 
the vector is greater than 2. For one-dimensional vectors, we do not use a bold face 
and simply write like x, regarding as a scalar. 

We denote by R” the set of n-dimensional real column vectors, and by R”*” 
the set of m x n real matrices. The transpose of a vector æ and a matrix A are 
respectively denoted by x! or A. The i-th element of a vector æ and the (i, j)-th 
element of a matrix A are respectively denoted by (a); and [A];j. We denote by Z 
the set of integers and by N the set of natural numbers, that is, N = {1, 2,3,...}. 

For a vector x € R”, supp(ax) denotes the support set of a, that is, the set of 


nonzero elements of  =[Xx1,.--.,Xn]| € R": 
supp) = {i € {1,... n} : x; #0}. (1) 
The £” norm of æ € R” is defined by 
lællo = #(supp(x)), (2) 


where #(-) returns the number of elements of the argument set. The £P norm with 
p > 1 is defined by 


n 1/p 
llellp = [> wr] ' (3) 
i=l 


and the £% norm by 


A 
Il lloo = max [xi]. (4) 
121,255.45 
In Part II, these norms will be denoted by ||x|| po, ||x||¢r, and ||a|| geo to distinguish 
norms for continuous-time signals. 


x Notation 


For a vector x € R”, and an index set S$ C {1, 2, ..., n}, we denote by ag the 
restriction of æ to S. Ifa = [x1, x2, ..., Xn]! and S = {ij,i2,..., in} I < i1 < 
ig <- < ip < n), then 


gs = [kis Xis -3 Xi]! € RÉ. (5) 
Also, for ® = [1, $2, ..., Pn] € R”””, Ds is defined as 
Os = [i,, Pins +++ Piz] e R”*K, (6) 


The complement of a set S is denoted by S°. 
Let f : [0, T] > R bea measurable function with T > 0. The support of f is 
denoted by supp( f) and defined by 


supp(f) = {t € [0, T]: fŒ) # 0}. (7) 
The L? norm of f is defined by 
If llo = u (supp(f)), (8) 
where u is the Lebesgue measure over R. The L? norm with p > 1 is defined by 
T 1/p 
Ifllp = { / forar] , (9) 
and the L% norm by 
IIflloo = sup If@I. (10) 
te[0,T] 


We denote by L?(0, T) with p > 1 or p = o the set of functions with finite L? 
norm. 


For a function f : R” — R, the gradient V f is defined by 


T 


0x X xX2° Xn 


(11) 


v= 


We say a real-valued function f(n), n € N, is O(g(n)) if 


lim sup 
n—- Oo 


g(n) 
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Chapter 1 


Introduction 


In this chapter, we briefly review the history of sparsity methods in science and 
engineering. The chapter will motivate you to learn this topic. The content of this 
chapter is independent of the other chapters, and readers interested in the techni- 
cal aspects of sparsity methods can skip this chapter without much effect on their 
understanding of the rest of the book. 


1.1 Occam’s Razor 


At the root of sparsity methods is the idea that one should not assume more 
than is necessary to explain certain things. This is known as Occams razor, 
also called the daw of parsimony, developed by Ockham in the 14th cen- 
tury. This idea was not invented by Ockham, but rather long before him, for 
example, by Claudius Ptolemy (Q0AD-168AD) and Aristotle (384BC-—322BC). 
This is a very familiar concept to us, especially in Japan, where there is a 
culture of Zen and Wabi/Sabi, which can be roughly translated as simple is 
best. 

There is a satirical depiction of the opposite of Occam’s razor, Rube Goldbergs 
machine. Figure 1.1 shows an example of Goldberg’s machine. The machine is 
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Figure 1.1. Goldberg’s machine (self-operating napkin). 


“self-operating napkin,” which automatically wipes off the dirt from the beard when 


he drinks soup. This caricature depicts a machine that performs very simple actions 


with extreme complexity, and satirizes the large-scale mechanization in the first half 


of the 20th century. The machine runs as follows. 


. The man raises the spoon of soup (A) to his mouth. 

. The string (B) attached to the spoon (A) is pulled. 

. The ladle (C) moves. 

. Cracker (D) flies on the parrot (E). 

. The parrot (E) takes off after the cracker (D). 

. The perch (F) tilts. 

. The seeds (G) on the perch (F) spills out, and goes into the pail (H). 
. The string (I) is pulled by the extra weight in the pail. 

. It ignites the cigar lighter (J). 

. The fuse of the rocket (K) is lit and it takes off. 

. The knife (L) attached to the rocket (K) cuts the string (M). 

. The pendulum swings and the napkin (N) wipes the dirt from the 


beard. 


The Goldberg machine is obviously strange. However, in this highly technolo- 


gized society, we might have created something like the Goldberg machine without 


even realizing it. The sparsity method is therefore an essential technique to avoid 


such a situation. 
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1.2 Group Testing 


Group testing is one of the first attempts to apply a sparsity method to a scientific 
problem. Group testing was proposed by Robert Dorfman in 1948 as a problem 
of finding an infected person among a large number of patients in a small number 
of blood tests [32]. For example, suppose that only one of eight patients is infected 
with a disease, which can be detected by examining the blood. Now, we have eight 
blood samples from the eight patients. Since blood testing is expensive and time- 
consuming, we want to identify the infected person as few times as possible. In this 
case, there is a good way to do this (see also Figure 1.2). 


e (TEST 1) We first divide the blood of the eight patients into two groups of 
four patients, and take a little bit of blood from each of the eight patients, 
and mix it for each group. Since there is only one infected person, the blood 
from either group will test positive. 

e (TEST 2) Divide the group that tested positive into two groups of two 
patients, and do the same thing. At this point, the number of suspicious per- 
son has been narrowed down to two. 

e (TEST 3) Finally, by examining the blood of the two individuals separately, 
the infected person can be uniquely identified. 


By this method, it is possible to identify an infected person in six tests, whereas 
eight tests would be required for an individual blood test. In general, according to 
the above method, if there is only one infected person among 27 people, we can 


TEST 1 cusses 


— a = 


<i ad 


negative 


TEST 3 . aa 
positive : negative 


Figure 1.2. Group testing from eight blood samples. 


= 


positive 
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identify the infected person in less than 2T tests. For example, for 1,024 patients, 
only 20 tests are needed to identify the infected person. We can see that group 
testing can dramatically reduce the number of tests compared to testing all patients’ 
blood individually. We would like to consider a sophisticated method like this in a 
general situation where a few people in 100,000 for example are infected, instead of 
examining the blood of 100,000 people individually. This is the problem of group 
testing. 

Now let us describe the problem of group testing in detail. Let n be the number 
of people to be tested. Define a variable x; representing whether the i-th person 
(i e {1,2,...,}) is infected or not as 


1, if the i-th person is infected, 
xj 2 ae (1.1) 
0, otherwise. 
Define an n-dimensional binary vector that takes values of 0 or 1 as 
a © [x1,%2,...,%n]" € (0, 1)", (1.2) 


where, {0, 1}” is the set of n-dimensional vectors whose elements are 0 or 1. The 
problem here is to find this n-dimensional binary vector. Of course, if we examine 
each one of them individually, we can determine the vector x with n tests, but here 
we want to identify x with a much smaller number of tests. 

Let us consider a subset S of the set {1,2,..., n}. We define a function A that 
returns 1 if there is an infected person in S and 0 otherwise as follows: 


0, otherwise. 


A(S) = | (1.3) 


In group testing, a number of people are selected among n people to form a 
group, and their blood is mixed to test for infection. We form m groups denoted 


by S1, S2, ..., Sm. Define vector y of the results of the tests for these groups, 


y Ê [A(S1), A(S2),..., A(Sm)]' € {0, 1)". (1.4) 


Also, we define matrix ® e {0, 1}’"*” by 


l, ifieS; 
wel ee (1.5) 


0, otherwise, 


where ®;; is the (i, j) element of ®. 
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Now, we introduce the /ogical disjunction ® and the logical conjunction ® for 
binary numbers defined as 


080=0, O0G1=1, 160=1, 1O61=1 


(1.6) 
0@0=0, 0@1=0, 1@0=0, 1@1=1. 


Then the relationship between the vector y of test results and a of infection is 
represented as 


Or = y, (1.7) 


where the sum and product in this representation are taken as the logical disjunction 
and logical conjunction. Since the goal is to dramatically reduce the number m of 
tests, m is much smaller than the number of patients n. In this case, the above 
equation (1.7) has infinitely many solutions in general (if solutions exist). This 
means that we cannot uniquely determine the original æ only from (1.7). However, 
if we assume that the number of infected people is much smaller than n, or vector 
x is sparse (i.e., & has very few nonzero elements), then we can formulate group 
testing as the following optimization problem: 


minimize ||a||o subject to Ox = y, (1.8) 
awe{0,1}”" 


where ||ællo denotes the €° norm, the number of nonzero elements in æ. We will 
discuss the £? norm in detail in Chapter 2. The optimization problem is a combi- 
natorial optimization problem or a binary optimization problem, and the computa- 
tional burden increases exponentially as the size n increases. Dorfman’s paper [32] 
proposes various methods for efficiently running group testing, and an increased 
interest in this topic has emerged especially with the recent development of sparsity 
methods. For recent methods, see, for example, [1, 3]. 


1.3 Optimization with £! Norm 


As we will see in this book, €!-norm optimization is one of the most important 
techniques for sparsity methods as approximation of £°-norm optimization. 


1.3.1 Signal Reconstruction 


The first study of optimization with the £! norm as a sparsity method is found in 
the dissertation by Franklin Logan in 1965 [69]. Logan considered the problem of 
signal reconstruction from noisy data. In his dissertation, he showed that €!-norm 
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—> 
supp(G) supp(n) 


Figure 1.3. Reconstruction from noisy signal f(t) = g(t) + n(t): the Fourier transform G(w) 
of g is band-limited in the frequency domain, and the noise n is localized in the time 
domain. 


minimization completely eliminates the noise when the original signal is band- 
limited to a certain frequency and the noise is well localized (i.e., sparse) on the 
time axis. More precisely, if we have noisy observation 


fO=2t)+n(t), t=0,1,2,... (1.9) 


where the Fourier transform G (œ) of g has its support on a low-frequency range, 
and the support of n (t) is sufficiently short, then the £! optimization leads to per- 
fect reconstruction of g from f. This is called Logans phenomenon. Figure 1.3 
illustrates the signal assumptions (band limitation and sparsity) in Logan’s phe- 
nomenon. The sparsity method by €!-norm minimization was then extended in 
[31] to signal recovery when the original signal is sparse in the frequency domain 
(i.e., the original signal may not necessarily a low-frequency signal). 


1.3.2 Geophysics 


In the field of geophysics, sparsity methods by £! optimization have been proposed 
since the 1970s. The structure of the strata can be estimated by generating artificial 
earthquakes near the ground surface and observing the reflected waves. This is a 
method called the reflection seismic survey. This is a problem of system identification 
or an inverse problem, where the characteristics of the system is estimated from its 
inputs and outputs. As shown in the left-hand diagram in Figure 1.4, we consider a 
linear system R with input (wave by the artificial earthquake) u(t) and the output 
(reflected wave) y(t). 

The problem is to find the impulse response r(t) of the system R from the 
input/output data of u(t) and y(t). In the case of seismic reflection waves, the 
impulse response r (t) can be assumed to be localized in time (see the right-hand 
figure in Figure 1.4). That is, the impulse response is sparse. From this, the £! regu- 
larization was proposed to reconstruct the sparse impulse response [23, 100, 109]. 
These are other early studies that used the idea of sparsity. 
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r(t) 


Figure 1.4. Linear system R with input u(t) and output y(t) (left), and its sparse impulse 
response r(t) (right). 


1.3.3 Neural Networks 


In the field of neural networks, the idea of sparsity has also been investigated. Since 
the 1980s, Masumi Ishikawa has been proposing a method to avoid overfitting 
by introducing the £! norm regularization into the training of multilayer percep- 
trons [58]. He proposed to sparsify the coupling weights of the network to avoid 
overfitting. This is a method of machine learning that takes advantage of the human 
brain’s ability to forget, which is called structure learning with forgetting. The method 
can lead to an explainable structure of multilayer neural networks, which allows 
people to understand the learning results. Also, the technique of dropout in recent 
deep neural networks is based on a similar idea of sparsity [42, 106]. 


1.3.4 Statistics 


In statistics, the method called LASSO (Least Absolute Shrinkage and Selection 
Operator) is the most famous method with sparsity. Let us consider polynomial 
curve fitting from given data. If we can set many of the coefficients (parameters) to 
zero, the terms with zero coefficients will not affect the estimation at all and we can 
avoid overfitting. Such a method is called a shrinkage method in statistics. LASSO, 
the shrinkage method with £! norm regularization, was proposed by Robert Tib- 
shirani in 1996 [110]. The idea of LASSO has been extended to elastic net regu- 
larization [119] with the sum of the £! norm and the squared £? norm as a reg- 
ularized term, and group LASSO with the sum of weighted £? norms for grouped 
vectors [116]. We will study LASSO in Chapter 3. 


1.3.5 Signal Processing 


The first research area where sparsity methods became a hot topic is signal pro- 
cessing. A method called basis pursuit with £! norm optimization to recover sparse 
signals was proposed in 1994 by Chen and Donoho [21] at the 28th Asilomar 
Conference on Signals, Systems, and Computers.' In addition, the total variation 


1. Later, this work was published as a journal article [20] with Saunders as a co-author. 
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denoising, by using the £! norm of the difference of a signal was proposed in 1992 
by Rudin eż al. [97]. More recently, Donoho et al. proposed a new theory of sens- 
ing and recovery called compressed sensing [30] in 2006, which is a theoretically 
refined version of the basis pursuit. In the same year, Candes and Tao also pub- 
lished a paper on this topic [16]. 2006 is the year that the current development 
of compressed sensing began. Compressed sensing was a topic in the fields of sig- 
nal processing and information theory at that time. However, the topic is now 
widely attracting a lot of attention in various research fields including systems and 
control. 


1.4 Sparsity Methods for Systems and Control 


Here we describe a brief history of sparsity methods for systems and control to 
provide a motivation for studying the new research topic. 


1.41 Minimum Fuel Control and L! Optimization 


In the field of automatic control, sparsity has been recognized for a long time. 
An example is the minimum fuel control, which is an optimal control that mini- 
mizes the L! norm of control among feasible controls. The minimum fuel control 
has been actively discussed in the field of control theory since the early 1960s [2]. 
At that time, the space race between the United States and the Soviet Union was 
most heated, and the minimum fuel control has a background in the discussion 
on how to reduce the fuel consumption of rockets from the earth to the moon, 
for example. As we will see in Chapter 7, the minimum fuel control is a bang- 
off-bang control that takes ternary values of +u max (the maximum amplitude that 
the control can produce) or zero, under some assumptions. When the control 
takes a value of zero, the rocket engages in inertial flight, and hence it can reduce 
fuel consumption during this time. This is why the control is called minimum 


fuel. 


1.4.2 Maximum Hands-off Control 


The L!-optimal minimum-fuel control is shown to be equivalent to the L°-optimal 
control (the sparsest control) in [82, 83] under the assumption of non-singularity. 
The sparse control with minimum L? norm is called the maximum hands-off con- 
trol. The mathematical properties of the maximum hands-off control was investi- 
gated in [19, 55]. This has also been extended to time-optimal control [57], dis- 
tributed control [52, 54], continuous control [85], and infinite-dimensional sys- 
tems [51]. The maximum hands-off control will be discussed in Chapters 8—10. 
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Oscilloscope 


Scope vertical input e7 i“ 
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horizontal 
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Measurement of 


Pushbuttons (4) 
controlling gas jets 


Figure 1.5. The rocket cockpit illustrated in 1960’s textbook [2]. The pilot just operates 
the switches with the observation of the position and velocity of the rocket. This figure 
is from [2, p. 608, Figure 7-62]. 


1.4.3 Discrete-valued Control 


The ternary property of the minimum fuel control is also understood as discreteness. 
It has been known since the 1960s that certain types of optimal control show such 
discreteness of control. In fact, the classical textbook [2] states that the discrete- 
valued control can be implemented as a few switches in the rocket cockpit (see 
Figure 1.5). Of course, it is obvious that such a simple manual control would be 
useless and dangerous to fly into space, and that an automatic control with a feed- 
back mechanism is essential. However, the discrete-valued control expressed only 
by switching on and off, is very important in recent resource-aware networked con- 
trol systems, such as the Internet of Things (IoT) or Cyber-Physical Systems (CPS). 
Discrete-valued control with the idea of sparsity was proposed in [53, 56]. In these 
papers, the minimization of the sum of absolute values (SOAV) of the control to 
enhance the discreteness. We will study the SOAV control in Section 10.2 of Chap- 
ter 10. 


1.4.4 Robust Control and Rank Minimization 


The optimal control mentioned above requires a complete mathematical model of 
the controlled object (e.g., a rocket). However, there should be uncertainties in the 
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model and parameters in reality, and how to deal with them has been a major chal- 
lenge in automatic control theory. Robust control, a theory of control systems design 
that takes uncertainty into account, was actively studied in the 1980s, with H° 
control theory being one of the most successful examples (see e.g. [1 17]). Some basic 
problems in H® control boil down to the problem of finding a matrix satisfying 
linear matrix inequalities (LMIs) [11, 33]. An LMI is a convex constraint, which can 
be easily solved using convex optimization (especially the interior point method). 
However, if you want to control a large-scale and high-dimensional system by a 
simple and much lower-dimensional controller, or if you need to treat structured 
uncertainties, the problem becomes LMIs with a matrix rank constraint (or rank 
minimization), which is much more difficult to solve since the rank constraint is 
non-convex.” 
The rank minimization problem is in general described as 
pe rank(X) subject to M(X) + Q > 0 (1.10) 
eRnxn 
where M(X) is a linear function of X, Q is a matrix, and the inequality A > B 
means A — B is positive semidefinite. It is easily shown that the matrix rank is the 
number of non-zero singular values (i.e. the € 0 norm), and hence this is a problem 
related to sparsity. As mentioned above, the €° norm is often approximated by the 
£! norm, and in this case, we minimize the sum of absolute values (i.e. the £! norm) 
of the singular values. This is called the nuclear norm and denoted by || X ||. That 
is, the rank minimization problem in (1.10) is approximated to the nuclear norm 
minimization: 
minimize || X||, subject to M(X) + Q >0 (1.11) 
XeR™ 
The pioneering work by Mesbahi and Papavassilopoulos [75] showed the equiv- 
alence between (1.10) and (1.11). Interestingly, this was published in 1997 prior 
to the theory of compressed sensing in 2000s. For the equivalence, they used the 
property of Z matrix, which has not been considered in standard compressed sens- 
ing theory. In this book, we do not deal with rank minimization. Readers who are 
interested in rank minimization may refer to [73]. 


1.4.5 Resource-aware Control for Networked Control Systems 


Sparsity methods have also been applied to networked control systems. A networked 
control system is a feedback control system where the communication between the 


2. The rank constraint can be equivalently transformed into a bilinear matrix inequality (BMI), which is also 
difficult to solve. 
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Wireless network 


CPU 


Figure 1.6. Networked control system. 


controlled object and the controller is limited. Figure 1.6 shows an example of a 
networked control system. In this system, sensor data from the drone is sent to the 
computer (CPU) via a wireless communication network. Based on the information, 
CPU updates the control values for the attitude, speed, and acceleration of the 
drone, and returns the control commands to the drone via the network. 

For networked control systems, sparsity methods play an important role to real- 
ize resource-aware control that can significantly reduce the communication and com- 
putational burden. In [39, 80, 81, 84, 86, 91], sparse control is proposed by using 
€! norm minimization for discrete-time systems, by which we can reduce the size of 
control packets that are sent through rate-limited communication networks. These 
are finite-horizon control and to obtain feedback control, we can adapt the receding 
horizon control or the model predictive control formulations. 

Minimum actuator placement is also an important sparsity method for resource- 
aware control. This is to minimize the number of actuators (or control inputs) that 
achieve a control objective (e.g. controllability). The problem has been discussed 
in (50,39; 90, 95,95; 95, 111]: 

For state feedback control, the control gain matrix is also  sparsified 
[29, 67, 68, 76]. The obtained feedback controller is sparsely structured and the 
design should achieve an optimal tradeoff between closed-loop performance and 
sparsity. See a review paper [61] by Jovanović and Dhingra for detailed discussion 
on this topic. 


Sparse Representation 
for Vectors 


DOI: 10.1561/9781680837254.ch2 


Chapter 2 


What is Sparsity? 


In this chapter, we explain the notion of sparsity, and introduce sparse representa- 
tion of vectors and functions. The notion introduced in this chapter is important 
throughout this book, and hence do not omit this chapter. 


Key ideas of Chapter 2 


e Sparsity of a vector is measured by its £? norm. 
e In sparse representation, a redundant dictionary of vectors is used. 


e In sparse representation, the smallest number of vectors are automati- 


cally chosen from a redundant dictionary that represent a given vector 


(€° optimization). 
P 


e The exhaustive search to solve €° optimization requires computational 


time that exponentially increases as the problem size increases. 
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2.1 Redundant Dictionary 


Let us consider the three-dimensional vector space RÌ. The standard basis for R? is 
formed with the following three unit vectors: 


1 0 0 
ep=]0], ex=|{1], eg =]0]. (2.1) 
0 (0) 1 


By using this basis, any three-dimensional vector y € R? can be represented as 


y = | y2 | =yie1 + y2€2 + y3e3. (2.2) 


In general, if you choose three linearly independent vectors 1, @2, and @3 from 
R3, then they form a basis for RÌ. That is, for any vector y € R?, there exist unique 
real numbers £1, £2, and {3 such that 


y = Pipi + iodo + f3¢3 (2.3) 


holds. Moreover, if 61, Q2, and @3 are unit vectors and orthogonal to each other, 
that is, 


l, i=j “3 
(bipi) = 6) Gi = » if =1,2,3, (2.4) 
$1.0) =o) ae j 
where (-, -) is the £? inner product (see also Section 2.3). Then @1, @2, and @3 form 
an orthonormal basis for RÌ, and the coefficients £1, £2, and 83 can be obtained by 
the inner product 


Maudit u 1 =1,2,3. (2.5) 


Exercise 2.1. How do you obtain the coefficients £1, £2, and £3 in (2.3), when 
1, $2, and @3 are linearly independent but they do not form an orthonormal 
basis? 


Let us consider another basis for R? with the following three linearly indepen- 
dent vectors: 


1 

ġı=ei+e= |1|, ġ2=e2:+e3=|1|, gG=exte,=|0 
0 1 1 

(2.6) 
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Figure 2.1. 6 vectors e1, e2, €3, 61, $2, $3 in RÈ. 


Combining these with the unit vectors in (2.1), let us form a set of 6 vectors 
{e1, €2, €3, @1, P2, 3}. Figure 2.1 shows these 6 vectors. With these vectors, con- 
sider the following representation of vector y € R3: 


3 3 
y=) aiei +) fidi. (2.7) 


i=l i=l 


This is a redundant representation, and there are infinitely many solutions for a; 
and f; (i = 1, 2, 3) to satisfy (2.7). For example, for y = [y1, y2, y3]', we have 
two solutions 


(a1, a2, 03, P1, P2, 63) = (1, Y2, y3, 0, 0, 0), (2.8) 


and 


(a1, a2, 03, P1, P2, B3) = (—y3, —y1, —y2, Y1, Y2, Y3). (2.9) 


Now, let us consider a situation where the cost to keep the values of the non-zero 
coefficients is very expensive due to an expensive memory device for example. Then 
we want to minimize the number of non-zero coefficients to reduce the cost. Let 
us consider a vector y € R? on the plane spanned by e1 and @p. For this vector, 
we have the following solution: 


y = aiei + frodo, (2.10) 


This expression has smaller number of non-zero coefficients than (2.8). This is a 
trivial example, and the cost of (2.10) is almost the same as that of (2.8). However, 
if we can find just 102 non-zero coefficients for a 106 (one million) dimensional 
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vector, the cost will be dramatically reduced. Such a technology is often called data 
compression, which is one of the biggest motivations of sparse representation. 


Example 2.2. The four cardinal directions form a redundant system to represent a 
direction in R?. We say for example “Go southwest” not “Go minus-north-minus-east” 
although the two are mathematically equivalent. 


Example 2.3. Imagine that you need to explain what an elephant is to a foreigner, 
who cannot speak English but has a small dictionary with 3000 words but the word 
“elephant.” You might say “Elephant is the largest living land animal that has a long 
nose, many of them live in African savanna...” then the foreigner will ask “What is 
savanna?” since the word is not in the foreigners dictionary. But if the foreigner has a 
large dictionary that has more than 1 million words, you just say “That is an elephant.” 
Some English teachers say you need to memorize only these 3000 words for conversation, 
but actually 3000 words are not enough at all for simple expression. 


Let us formulate this problem of sparse representation in a general form. Let us 


consider m-dimensional vector space R”, and a set of vectors {@1, @2,..., On} in 
R”, where m < n. Fora given vector y € R”, we find coefficients @1, @2,..., An 
such that 
n 
y= > ai Pi. (2.11) 


i=l 


We assume that m vectors in {ġ1, @2,..., dn} are linearly independent. We call 
such a set of vectors {@1, Q2, . . . , Qn} a dictionary (recall Example 2.3), and the 
elements @1, $2, ..., Ọn atoms.’ Note that the size n of the dictionary is larger 
than the size m of vector y. We call such a dictionary a redundant dictionary, or 
over-complete dictionary. 

Define a matrix ® and a vector x as 


@2> [di @2 i. pn] E R", êj . | eR’. (2.12) 
On 
Then, the equation (2.11) can be equivalently written as 


Ox = y. (2.13) 


1. We do not call them words. 
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The matrix ® is called a dictionary matrix, or a measurement matrix. Since the 
dictionary is redundant, the matrix ® is a fat matrix, that is, the number of columns 
is larger than the number of rows. Our problem is now described as follows: 


Problem 2.4 (Sparse Representation). Given a vector y € R” and a dictionary 
{h1, b2,..., bn}. Find the simplest representation of y that satisfies (2.13). 


In the next section, we discuss this problem with a fat matrix. 


2.2 Underdetermined Systems 


Let us consider the following system of linear equations with unknowns x1, x2, 
and x3: 


xı +x2+%3 =3 


2.14 
x, —x3=0 l ) 


Now there are three unknowns and two equations, and it is easily seen that there 
are infinitely many solutions. To represent all solutions, we use parametrization. All 
solutions of (2.14) are parametrized as 


xXp=t, x= —2t +3, x =t, (2.15) 


where t € R is a parameter. We call such a system of equations an underdetermined 
system, where the number of unknowns is larger than the number of equations. 
An underdetermined system is something like insufficient proofs for a detective 
to determine one among many suspects. For a detective, say Conan Edogawa,’ the 
two proofs (equations) in (2.14) are insufficient and he should seek one more proof 
to reveal the unique solution of the case. Thanks to his investigation, a proof was 
found, which said “the criminal is the smallest one among the suspects.” This is actu- 
ally a conclusive proof that can choose just one suspect. Let us find the smallest solu- 
tion among the candidates in (2.15). We use the £ 2 norm as a measure of the size, 
and we find the smallest €?-norm solution as follows. First, from (2.15), we have 


læ} = x? +22 +22 
=t + (2t +3) +r (2.16) 
= 6(t — 1)? +3. 


Then we can choose t = 1, and from (2.15), the solution is uniquely chosen as 
(x1, x2, x3) = (1, 1, 1). Case closed. 


2. See: https://en.wikipedia.org/wiki/Case_Closed 
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Let us generalize the above discussion. We consider a system of linear equations 
in a matrix form as 


Ox = y. (2.17) 


For example, the system in (2.14) can be represented in the matrix form (2.17) with 


X1 
1 1 1 3 
=|; 0 bal r2=|x]l, v=[ol: (2.18) 
X3 


We assume the size of matrix Ọ® is m x n wherem < n, that is, we consider an 
underdetermined system of equations. We also assume that there are m column 
vectors in {@1,..., Qn} that are linearly independent. In other words, we assume 
® has full row rank. Note that a matrix ® e R”*” is said to have full row rank if 
® is surjective, or 


rank(®) = m. (2.19) 


If rank(®) < m, then there exist redundant linear equations (i.e. there is at least 
one equation that is a linear combination of other equations). For example, the 
following system of equations 


xy tx24+43=3 
x, —-x3=0 (2.20) 
2x, +x. =3 
is redundant and the rank is 2 < 3. We here assume such redundancy should be 
eliminated beforehand. 
If © has full row rank, then for any vector y € R”, there exists at least one 
solution æ that satisfies the linear equation (2.17). Let denote by æo a particular 


solution of (2.17). 
Define the kernel (or null space) of matrix ® by 


ker(®) = {a € R” : Ox = 0}. (2.21) 


Note that ker(®) is a linear subspace in R”, that is, if £1, @2 € ker(®), then 
a,x +azx € ker(®) for any a1, a2, E€ R. 
Then, we introduce the dimension theorem in linear algebra. 


Theorem 2.5 (dimension theorem). For any matrix ® e R”*", 
rank(®) + dim ker(®) = n (2.22) 


holds. 
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From the dimension theorem, the dimension of ker(®) isn — m. Sincen > m, 
the kernel, which is a linear subspace in R”, has at least one dimension. That is, 
there exist infinitely many vectors in ker(®). Then, all solutions of the linear equa- 
tion (2.17) can be represented by the sum of a particular solution xg and a free 
parameter z € ker(®), that is, 


£z = z0 +z, ze€ker(®). (2.23) 
From this it follows that there exist infinitely many solutions of (2.17). 


Exercise 2.6. Show that the vector x in (2.23) is the solution of the 
equation (2.17). 


The problem of sparse representation (Problem 2.4) is to find a solution x of 
(2.17) that has the simplest representation, or the smallest number of non-zero 
elements. Let us consider this problem more precisely in the next section. 


2.3 The £? Norm 


We here review the notion of a norm in a finite-dimensional vector space, and then 
introduce the £? norm that defines the sparsity of a vector. 
First, let us recall the definition of a norm in R”. 


Definition 2.7. A norm ||x|| : R” — [0, 00) is a nonnegative function that satisfies 
the following properties: 


1. For any vector x € IR” and any number a € R, ||ax|| = |a|||x|l. 
2. Foranyx,y € R”, |æ + yll < llell + lly. 
3. læl =O — > z=0. 


A well-known norm in R” is the £? norm (or the Euclidean norm). For a vector 
x£ = [x1, X2, ..., Xn]! € R”, the £? norm is defined by 


læll2 A [Rtx ta. (2.24) 


The £? norm is also given by 


læl2 = y (z, £), (2.25) 


where (-, -} is the €? inner product (or Euclidean inner product) in R”, defined by 


(a,y) Sy'e@= > xy. (2.26) 
i=l 
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Exercise 2.8. Confirm the £? norm ||2||2 defined in (2.24) satisfies the three prop- 
erties in Definition 2.7. 


Not only the £? norm, we can define (infinitely) many norms for R”. A gener- 
alization of the £? norm in (2.24) is the €? norm with p € [1, 00), defined by 


n 1/p 
iels bs nt?) (2.27) 
i=1 


The most important norm in this book is the €! norm with p = 1 in (2.27). The 
£! norm is described as the sum of the absolute values of the elements in a vector, 
that is, 


n 
lel = >o ll: (2.28) 
i=l 


The limit of (2.27) as p =>> œ is called the £% norm (or the maximum norm), 


defined by 


[tlloo = max Įxil. (2.29) 
n 


i= 


jt 
N 


Exercise 2.9. Prove that for any x € R”, 
lz lloo = lim |x| p- (2.30) 
p00 


Figure 2.2 shows the contour curves that satisfy ||x||, = 1 for p = 1,2, and 
oo in R*. The contour of the £? norm is a unit circle centered at the origin. The 


Tı 


Figure 2.2. Contour curves (l|ællp = 1) of £!, €, £% norms. 
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contour of the £% norm is a unit square centered at the origin, and touches the £? 
circle at (1,0), (0, 1), (—1, 0), and (0, —1). The shape of the contour of the £! 
norm is very important for sparse representation. This diamond-shaped contour 
has four corners on the x; and x2 axes. This property gives an intuitive explanation 
of the relation between £! norm and sparsity (see Section 3.2). 

Now, let us define the £? norm. Consider a vector Œ = [x1], X2, ..., Xn]! € R”. 
Define the support of x by 


supp(@) = {i € {1,2,..., n}: x; #0}. (2.31) 


The support of æ is the set of indices on which the elements of æ are nonzero. By 
using the support, the €° orm is defined by 


lælo = #(supp(a)), (2.32) 


where #(supp(a)) is the number of elements in finite set supp(x). Namely, the € o 
norm counts the number of nonzero elements in x. 

It is notable that the £? norm does not satisfy the first property in Definition 2.7. 
For example, a nonzero vector 2 € R” has the same £° norm as 2a. This implies 
that 


l2ælo = llallo F 2llællo, (2.33) 


whenever æ Æ 0. Strictly speaking, the €° norm is not a norm, and hence we 
sometimes call it as £? pseudo-norm or cardinality. However, we use the term “£? 
norm” as often used in the literature. Note that by definition, the second and third 
properties in Definition 2.7 hold, that is, 


læ + yllo < llxllo + Ilyllo (2.34) 
and 
lelo =O — > x=0. (2.35) 


Finally, we define the sparsity of a vector by using the £? norm. A vector æ € 
IR” is said to be sparse if the £? norm ||a||o is sufficiently small compared to the 
dimension n. The notion of sparsity is important in this book. 


Exercise 2.10. Prove that for any x, y € R”, 
lz + yllo < llxllo + Ilyllo (2.36) 


holds. 
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Exercise 2.11. Let x, y € R”. When does the following equality hold? 
læ + yllo = llzllo + llyllo. (2.37) 


The problem of sparse representation Problem 2.4 is finding the sparsest solution 
among infinitely many solutions of the linear equation in (2.17). This problem is 
mathematically formulated by using the £? norm introduced above. That is, we 
seek the smallest £°-norm solution for (2.17). This is formulated as a mathematical 
optimization problem as follows: 


Problem 2.12 (Sparse representation). Given a vector y € R” and a full-row- 
rank matrix ®© € R”*" withm < n. Find the optimizer x* of the optimization 


problem: 


minimize ||a||9 subject to Ox = y. (2.38) 
xeR” 


We call this the €° optimization. 


2.4 Exhaustive Search 


In this section, we show a direct method to solve the £° optimization problem 
(2.38), called an exhaustive search (or brute-force search). Let pi € R” (i = 
1,2,...,m) denote the i-th column vector in matrix ®, that is, 


®-2[¢1 pr ... dnl eR”. (2.39) 


The following shows the procedure of the exhaustive search for (2.38). 


1. Ify = 0, then output 2* = 0 as the optimal solution and quit. Otherwise, 
proceed to the next step. 
2. Find a vector æ with ||a||9 = 1 that satisfies the equation y = Oa. That 


is, set 
X1 ' 0 
0 be i 
Tı 4 z > £2 4 0 serey Ln £ : (2.40) 
: . 0 
0 0 Xn 
and search x; € R (i = 1,2,..., 7) that satisfies 


y= Ox; = xi Qj. (2.41) 
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If a solution exists for some i, output 2* = a; as the solution and quit. 
Otherwise, proceed to the next step. 
3. Find a vector æ with ||a||9 = 2 that satisfies the equation y = Ox. That 


is, set 

X 

xX] A 0 

x2 
X3 

A A A 
1,2 = 0 ? X13 = 0 eee Ln—1,n = 0 (2.42) 
Xn-1 


= 
= 


and search x;, xj € R (i, j = 1,2,...,n) that satisfies 
y = Dzi, j = x16) + xj Oj. (2.43) 


Ifa solution exists for some i, j, then output x* = æ;, j and quit. Otherwise, 
proceed to the next step. 
4, Do similar procedures for ||a@||o = k, k = 3,4,...,m. 


By this exhaustive search, you can obtain the optimal solution æ* (if it exists) 
with finite number of steps (the worst case is k = m, where the optimal value is 
læ*llo = m). 

Next, we investigate the exhaustive search in detail. For a vector æ = 
[x1 X2, ..., Xn]! andan index set $ C {1,2,..., n}, we denote by £s € R#CS) 
the restriction of æ to the indices in S, where #(S) is the number of elements in S. 
For example, for æ = [x1, X2, X3, X4, X5, x6]! and S = {1, 2, 5}, we have 


X1 
TtTs=|xn]j|E RÌ. (2.44) 
X5 
More generally, for æ = [x1, X2, ..., xn]! and the index set 
S = {ij,i2,...,ic}, ke {1,2,...,n}, (2.45) 
where 1 < ij <i? <+:: < ip < n, we have 
Xi 
s 
as=| | eR. (2.46) 
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Also, for matrix ®© = [d@1,¢2,...,¢n] € R”*”" with ġ; €e R”,i = 
1,2,...,m and the index set in (2.45), we define 


Os = [di,» Pins sr te aig Pi; | = RB, (2.47) 


Using this notation, we can formulate the exhaustive search algorithm for the £? 
optimization (2.38) as follows: First check if y = O. In this case, the solution is 
a* = 0. Otherwise, take each subset S of the index set {1, 2,...,} from #(S) = 
1 to #(S) = m, and solve the following equation 


y = srs. (2.48) 
If there is a solution of (2.48), then using the solution ws = [Xj,,..., Xill, set 
g” = [r te x]! where 
Xi, LES, 
1s í (2.49) 
0 igs. 


This is the sparsest solution and we have ||æ*||o = k. We summarize the exhaustive 
search algorithm. 


Exhaustive search algorithm for £? optimization (2.38) 


1. Ify = 0 then output x* = 0 and quit. Otherwise, proceed to the next 
step. 

2. k:=1 

3. For each subset § C {1,2,..., n} with #(S) = k, do 


e Check if equation y = Dsavs has a solution. 
e If it exists, output æ* as in (2.49) and quit. 


4. k := k + 1. Return to 3. 


We should notice that with the exhaustive search method the computation time 


to find a solution grows exponentially with problem size (i.e. m). For example, in 
image processing, the dimension becomes millions or larger, and the exhaustive 
search is not useful at all. 


Exercise 2.13. For the optimization problem (2.38) of size m, compute the num- 
ber of iterations at the worst case where the optimal solution æ* has its £° norm as 
llælo = m. Then, let m = 100. Suppose that you can use a supercomputer that 
can do one iteration of the exhaustive search algorithm in 10~!> seconds. Compute 
the total time needed to do the exhaustive search at the worst case. 
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The above problem is also known as combinatorial optimization, which is in gen- 
eral hard to solve for large-scale problems. In the following chapters, we investigate 
efficient algorithms for such a hard problem of sparse optimization. 


2.5 Sparse Representation for Functions 


In this section, we discuss sparse representation for functions. 

Let us consider the function space L?(0, T), the space of all square integrable 
functions on (0, 7) in the sense of Lebesgue. That is, for any f € L?(0, T), the 
L? norm is finite: 


T 
flo ê f If (Pat < oo. (2.50) 


In this space, we can define the L? inner product 


T 
(fey f Fewer, (2.51) 


where g(f) is the complex conjugate of g(t). It is well-known that under the L? 
inner product, the space L? becomes a Hilbert space. 
Then let us consider an orthonormal basis {¢; : i € Z} in L?(0, T) that satisfies 


L ifi=j, 
(Qi, Oj) = 6ij = | se (2.52) 


0, otherwise. 


Then, for any function f € L?(0, T), there exist a complex sequence {aj : i € Z} 
such that 


f= >> agi, (2.53) 


i=—oo 


where the convergence is in the sense of L?, that is, 


N 
r- ai 
i=—N 


=> 0, (2.54) 
2 
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as N — oo. The representation (2.53) is called Fourier series’ of f. Given f and 
{9i}, the coefficients are obtained by the inner product 


T ———— 
b= e | Fh Wat. (2.55) 


Exercise 2.14. Prove that (2.55) holds. 


A standard basis for L?(0, T) is the Fourier basis defined by 
1. 
i(t) = —-e"", ieZ (2.56) 
Pit) Tt 


where j = /—1 and œ; = 2ai/T. With this basis, the coefficients in (2.55) are 


given as 


T — T l 
Qi = val fetdt = Tok fe dt. (2.57) 


For a sufficiently smooth function, the Fourier basis gives a good solution to rep- 
resent the function with a finite number of coefficients by truncation. That is, we 
approximate function f as 


N T 
fyv= >) ai6i, a1 = a [ fOe dt. (2.58) 


i=—N 


Actually, this is optimal in the sense that fy minimizes the L? error 


N 
En(B-n,---,Bv) =| f — >) Bid (2.59) 
i=—N 2 
among all coefficients {B_y,..., By}. 
Now, let us consider a rectangular function on L?(0, 1) defined by 

1, te€(0,1/2), 
f= ie (2.60) 

—1, #2 24), 


Figure 2.3 (left) shows this function. We can see that this function is discontinu- 
ous. The Fourier coefficients of this function can be easily computed using (2.57). 


3. This is also called as generalized Fourier series. Then, with the standard Fourier basis in (2.56), the series in 
(2.53) is called the Fourier series. 
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Fourier Coefficients 


1 L 1 0 e 
0 0.2 0.4 0.6 0.8 1 -20 -15 -10 6 0 
i 


Figure 2.3. Discontinuous rectangular function f(t) (left) and absolute values of its 
Fourier coefficients (right). 
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Figure 2.4. Truncated Fourier series fy (t) with N = 20 (left) and N = 100 (right). 


In fact, we have 


-2 ifiisodd 
gat eee’ (2.61) 
0, otherwise. 


Exercise 2.15. Show that the Fourier coefficients of the rectangular function in 
(2.60) are given by (2.61). 


Figure 2.3 (right) shows the absolute values of the coefficients with i = —20 
to 20. We can see that the coefficients converge to zero as i goes to +00. Actually, 
from (2.61), the coefficient sequence {aj} converges to zero as |i] —> o0 with 
convergence rate O(1/7). 

This fact suggests us to truncate the coefficients with N to obtain the approxi- 
mant fy (t) in (2.58). Figure 2.4 shows the truncated Fourier series fy (t) in (2.58). 

The left figure is fy (t) with N = 20. We see oscillations around the edges of 
the rectangular function. When we increase N to N = 100, we obtain another 
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p(t) 


Figure 2.5. Haar functions y1,y2, w3, and wa. 


oscillative function in the right figure of Figure 2.4. This oscillation never dis- 
appears around the edges for arbitrarily large but finite N. This is called Gibbs 
phenomenon. To store the information of the rectangular function as Fourier coef- 
ficients, you cannot truncate it but store all of the coefficients. 

Let us consider another orthonormal basis in L(0, 1) called Haar basis defined 
by Haar functions 


yi(t) = 1, (2.62) 
and fori = 2" +k, k =1,2,...,2",m=0,1,2,..., 
2m, te [2-™(k—1),2-""! (2k — 1)), 
wilt) = 4—V2™, te [27i (2k-— 1), 27k), (2.63) 
0, otherwise. 


Figure 2.5 shows Haar functions y1,y2, Y3, and y4. 

Then, if we adopt a redundant dictionary of bases consists of the Fourier basis 
in (2.56) and the Haar basis. From this dictionary, we can simply represent the 
rectangular function in (2.60) as 


f(t) = w6). (2.64) 


That is, we need to store just one coefficient under the redundant basis. This is 
the motivation to use a redundant dictionary and to obtain sparse representation 
for functions. As shown above, sparse representation of functions is to sparsify the 
coefficients in the Fourier series of a given function. 


2.6 Further Readings 


The notion of redundant dictionary and sparse optimization described in this chap- 
ter is fundamental and important in this book. The redundant representation of 
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vectors is related to frames and wavelets, for which readers can refer to nice books 
by Strang and Nguyen [107] and by Mallat [72]. For fundamental theory of vector 
spaces, called functional analysis, including norms, inner products, orthonormal 
bases, and Fourier series, I recommend books by Young [115] and by Yamamoto 
[114], which are written for scientists and engineers. 


DOI: 10.1561/9781680837254.ch3 


Chapter 3 


Curve Fitting and Sparse Optimization 


In this chapter, we study curve fitting to obtain a curve, or a function, from given 
data, and how sparse optimization effectively works for this problem. 


Key ideas of Chapter 3 


e Curve fitting is formulated as an optimization problem to choose one 
solution among (infinitely many) candidates. 

e Regularization is used for avoiding overfitting. 

e Sparse optimization is reduced to £! optimization, which is convex and 

efficiently solved by numerical optimization. 


3.1 Least Squares and Regularization 


We begin with the least squares and regularization with simple examples. 


3.1.1 Underdetermined System and Minimum €2-Norm Solution 


Let us consider the linear equation 
r =y, (3.1) 
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where y € R” is a given vector, ® € R”*” is a given matrix, and x € R” is an 
unknown vector. We here assume m < n, and ® has full row rank, that is, 


rank(®) = m. (3.2) 


Under these assumptions, there exist infinitely many solutions of the equation (3.1). 
Let us find the smallest £?-norm solution among them. This is formulated as an 
optimization problem 


1 
minimize —||a|5 subject to a = y. (3.3) 
xeR” 2 


We call this problem the £? optimization problem, and the solution the minimum 
€?-norm solution. 

To solve this problem, we can use the method of Lagrange multipliers. First, 
we define the Lagrange function, or simply Lagrangian, of the optimization prob- 
lem (3.3) by 


L(x, X) = sole + A| (Ox — y). (3.4) 


The variable A € R” is called the Lagrange multiplier. 

Then, we can obtain the optimal solution of (3.3) by finding the stationary 
point (a*, A*) of the Lagrange function L. By differentiating L by the variable æ, 
we have 


ôL (l 
S| Sele eal be) =xrz+0 A. (3.5) 
Ox ox \2 


It follows that the stationary point (a*, A*) satisfies 
a*+0'A*=0. (3.6) 


Then differentiating L by A gives 


= © (3.7) 
— = £ — Y, x” 
aN Y 

and hence 
Oxr* —y=O. (3.8) 


From this and (3.6), we have 


— ODl A* =y. (3.9) 
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Since ® has full row rank, the matrix PO! is nonsingular and has its inverse. 


Therefore, from (3.9) we have 

AN = -(00')7!y, (3.10) 
Assigning this to (3.6) gives the minimum ¢?-norm solution x* as 

x* = 0'(00!) ly, (3.11) 


In summary, if we are given a full-row-rank matrix ® and a vector y, we can 
compute the minimum €?-norm solution by the formula (3.11). 


Exercise 3.1. Find the minimum ¢?-norm solution of the following equation with 
unknowns x, and x2: 


axı taox2 = 1, (3.12) 
where a; and a2 are nonzero real numbers. 


Exercise 3.2. Let ® e R”*”, Prove that PØ! is invertible if ®© has full row rank. 


3.1.2 Regression and Least Squares 


Suppose we are given two-dimensional data 


D = {(t1, y1), 2, y2), -- -> (tins Ym)}- (3.13) 


Let us consider a polynomial of order n — 1, 
Y = f (t) = ant"! + ant"? + -+ + ait + a0. (3.14) 


Curve fitting is to find coefficients ag, 41, . . . , @n—1 with which the polynomial 
curve has the best fit to the m-point data (see e.g. Figure 3.1). For example, 
ti, t2, .. . , tm are sampling instants, and y1, y2,..., Ym are temperature data from 
a sensor at a portion. From these data, we often want to know the curve behind the 
data. We call such data analysis the regression analysis or polynomial curve fitting. 


ty t2 t3 t4 t5 


Figure 3.1. Interpolating polynomial. 


34 Curve Fitting and Sparse Optimization 


First, we consider an interpolating polynomial that interpolates the given 
data as shown in Figure 3.1. The polynomial curve (3.14) goes through the 
data points (3.13), and hence we have m linear equations with unknowns 
dn-—-1;, n-2, . . ., 41, a0: 


=i = 
Gy—ith | + an-2t +--+ aiti + a0 = y1, 
n—l1 n—2 
an—-1ĥ) +an-2ħ) “+--+ + ajt2 +40 = Yo, 
2 2 (3.15) 
n—l1 n—2 — 
An—-1\ty = + n—2ty +: + atm + a0 = Ym. 
Define a matrix 
n—-1 n—2 
fa, t 1l 
7 t” to 1 
2 2 
pE- , , eR”, (3.16) 
E tm 1 
and vectors 
H yı 
an—2 
A ` n A y2 m 
r2 eR”, y=]. | eR”. (3.17) 
al 
ao Ym 


Then the system of linear equations (3.15) can be represented in a matrix form: 
x = y. The matrix ® is known as a Vandermonde matrix, and ifm = n, then ® 
is a square matrix and its determinant is given by 


de®) = [| G-t) = at- t) m-i tm) 6.18) 


l<i<j<m 
It follows that if 
ti A tjs Vi, jst. i Æj, (3.19) 


then © is nonsingular and has its inverse. Hence, the solution æ* of (3.15) is given 
by using ©! as 


x* = Oly, (3.20) 
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In summary, if one choose an (m — 1)-th order polynomial for m data points that 
satisfy (3.19), then the coefficients of the interpolating polynomial can be uniquely 
obtained by the formula (3.20). 


Example 3.3. Let us consider the following data. 


tl 2 3... 14 15 
y|2 4 6 ... 28 30 


The data are obtained from a linear relation y = 2t. By using these 15 data 
points, we find a 14th-order interpolating polynomial. Now, we use a useful com- 
putational software, MATLAB, to compute the matrix inversion in (3.20). The 
following is a code to obtain the coefficients. 


MATLAB code for the coefficients of the interpolating polynomial. 


%% Data 

t = 1:15; 

y=2*t 

%% Vandermonde matrix 

Phi = vander(t); 

%% Coefficients of interpolating polynomial 
x = inv(Phi) * y’; 


In this code, vander is a MATLAB function to compute the Vandermonde matrix 
in (3.16). The vector variable y computed in the 3rd line is a row vector, and we 
should transpose it in the last line (i.e. y’). 

Running this code we obtain 


x= 
2.274746684520826e-24 
-5.565271161256770e-21 
9.137367918505765e-19 
-9.452887691357992e-18 
-3.658098129966092e-16 
-1.608088662230500e-15 
3.569367024169878e-14 
-6.021849685566849e-13 
5.346834086594754e-13 
-1.267963511963899e-11 
4.878586423728848e-11 
2.088995643134695e-12 
1.366515789413825e-10 
1.999999999995282e+00 
-4.014566457044566e-12 
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Figure 3.2. 14-th order interpolating polynomial with noisy data. 


This result shows that the second value from the bottom is nearly 2, and the 
other values are almost zero. That is, the coefficients are given by aj = 2 and 
ai = 0 fori # 1, and the interpolating polynomial is y = 2t. This is the right 


solution. 


Usually, data include noise. Let us add Gaussian noise with zero mean and vari- 
ance 0.5? to the data y in Example 3.3, and find the interpolating polynomial. 
The obtained curve is shown in Figure 3.2. The interpolating polynomial exactly 
goes through the data points, but the curve is significantly affected by noise, and 
very different from the original relationship y = 2t. Such a phenomenon is called 
overfitting. 

The reason of overfitting is that the order of the polynomial is too high. If we 
previously know that the original curve is of first order, then we can assume a 
first order polynomial (i.e. a line) y = aıt + do, and find the coefficients ag 
and a; with which the line has the best fit to the data. If the data is noisy, it 
is obviously impossible to obtain a line that goes through all the data points. 
However, it is not a problem at all if the line does not interpolate the noisy 
data. 

Now, let us reformulate our problem of curve fitting for noisy data. We measure 
the distance between the polynomial and the data points by the £? norm (Euclidean 
norm). For noisy data, we do not require the curve to go through the data points 
since it is in general impossible. We find the curve that is as close to the data points 
as possible. The optimization problem is described as follows: 


1 2 
inimize —||Ox — : 3.21 
ie zl ylls ( ) 
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where ® e R”*” is a Vandermonde matrix defined in (3.16). We call the opti- 
mization in (3.21) the least squares method. If we assume n < m, that is, if the 
order of the polynomial is less than m — 2, then the number of unknowns is 
less than that of equations. In this case, the matrix ® is a tall matrix, and the 
equation Ox = y has no solution in general. If the condition (3.19) holds, 
it is easily shown that the solution of (3.21) uniquely exists. In fact, if (3.19) 
holds, then the Vandermonde matrix ® has full column rank. Note that a matrix 
® e R”*" is said to have full column rank if the n column vectors in © are 
linearly independent. In other words, ® e R”*” has full column rank iff ® is 
injective, or 


rank(®) =n. (3.22) 
Then the unique solution of the optimization problem in (3.21) is given by 
x* = (D D) lo] y. (3.23) 


We call this the Zast squares solution. As the minimum £ 2 norm solution in (3.11), 
the least squares solution is also given by a closed form. 


Exercise 3.4. Prove that the solution of the optimization problem (3.21) is given 


by (3.23). 
Exercise 3.5. Let @; denote the i-th column vector in matrix ® € R” *”, that is, 
= [i Q2 see Pn] . (3.24) 


Then define the residual between the data y and the optimal estimation Pæ* with 
(3.23) by 


r Êy- Oa". (3.25) 
Prove that the residual satisfies 
(Qir) =0, Vie{1,2,..., n}. (3.26) 
Also, by using this fact, show that the residual r is orthogonal to Oa". 


Example 3.6. Let us consider Example 3.3 with additive Gaussian noise with zero 
mean and variance 0.57. We assume the curve is a first-order polynomial modeled 
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by y = aıt + ao. A MATLAB code to obtain the least squares solution of this 
problem is given as follows: 


MATLAB code for least squares solution 


%% Data 

t= 115: 

y=2* t+ randnd,15)*0.5; 
%% Vandermonde matrix 
Phil5 = vander(t); 

Phi = Phi15¢:,14:15); 

%% Least squares solution 
x = inv(Phi’ * Phi) * Phi’ * y’; 


In this code, randn1,15) isa MATLAB function that returns normally distributed 
random numbers (i.e. Gaussian noise) with zero mean and variance 1 of size 1 x 15 
(i.e. a row vector). The matrix variable Phi15 is a Vandermonde matrix of size 15 x 
15, and in the 6th line we extract 14th and 15th columns, which are related to 
coefficients a, and ag, to make matrix Phi of size 15 x 2. The result is shown 
below. 


x= 


1.985404378030957e+00 
1.359049380398556e-01 


Figure 3.3 shows the line y = aıt + do with these coefficients. While the 14-th 
order interpolating polynomial implies overfitting, the least squares line shows a 


good result. 


3.1.3 Regularization 


As we have discussed in the previous section, one can avoid overfitting by the least 
squares method with an appropriate order of the polynomial, which is less than the 
number of data points. However, what can we do if we do not know the proper 
order in advance? In this case, we can adopt regularization. Let us begin with a 
simple example. 


Example 3.7. Suppose that we are given a data set 


D = {(t1, y1), (t2, y2), - -~ , (tins Ym)}» (3.27) 
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Figure 3.3. Least square solution (solid line) and 14-th order interpolating polynomial 
(dashed curve). 


which is generated from a sinusoid y = sin(t). We consider sampling instants 


ti =0, 2=1, t =2,...,t%, = 10, (3.28) 


and the points y1, y2,..., Y11 are obtained as 
yi = sin(ti)+ éi, i=1,2,...,11, (3.29) 


where €; is Gaussian noise with zero mean and variance 0.2? added at time t; inde- 
pendently. The following table shows the obtained data. 


ti 0 1 2 3 4 5 

yi | —0.0343 1.0081 0.8326 0.4047 —0.7585 —0.9285 
ti 6 7 8 9 10 

yi | —0.2110 0.6626 0.8492 0.2761 —0.6962 


Figure 3.4 shows the data points and the original sinusoidal curve. 

For these data, let us find a 10-th order polynomial that interpolates the data 
points by using (3.20). Figure 3.5 shows the result. Affected by the noise, the curve 
is very oscillative and shows overfitting. We then take a 6-the order polynomial 
and compute the least squares solution by (3.23). Figure 3.6 shows the result. 
From this figure, we have a better fit than the interpolating function shown in 


Figure 3.5. 


In the above example, the order 6 was chosen by computing curves of all orders 
from 1 to 10, and comparing the reconstructed curve with the original sinusoid. 
However, this can be done ¿f we previously know the original sinusoid. This is impos- 
sible in real applications. That is, we do not know the optimal order just from the 
data in advance. 
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Figure 3.4. 11 data points from a sinusoid (dashed curve). 


0 2 4 6 8 10 
Figure 3.5. 10-th order interpolating polynomial (solid curve) and the original sinusoid 
(dashed curve). 


0 2 4 6 8 10 
Figure 3.6. Least squares solution with 6-th order polynomial (solid curve) and the orig- 
inal sinusoid (dashed curve). 
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To see the difference between the 10-th order interpolating polynomial and the 
6-th order least squares polynomial, we compare their coefficients. Let us denote 
by 219 and a6, respectively, the 10-th and 6-th order polynomials. They are 
obtained as 


—0.0343 
16.2400 
—38.0984 —0.0260 
37.8369 1.0636 
—20.2842 0.3067 
£10 = 6.5035 |, «w= | —0.5225 |, (3.30) 
—1.3100 0.1426 
0.1677 —0.0146 
—0.0133 0.0005 
0.0006 
—0.0000 


where the boldface numbers are the largest three elements in their absolute values. 
We can observe that the boldface values in a9 are much larger than those in a6. 
This is a cause of oscillation in the 10-th order interpolating curve. 

From the above observation, we try to minimize both the squared error || Oa — 
yll5 and the squared £? norm Eales of the coefficient vector æ at the same time. 
This is formulated as the following optimization problem: 


inimi at i+ 2 I3 (3.31) 
mMıInNniMIz — T= — T a z 
xeR” s 2 yi2 2 2 


We call this optimization the regularized least squares, or ridge regression. The addi- 
tional term 4 alls is called a regularization term, and the parameter À a regulariza- 
tion parameter, which control the balance between the error in curve fitting and the 
{2 norm of the coefficients. 

As in the least squares solution, the solution of the regularized least squares in 
(3.31) can be obtained in a closed form: 


x =(1+0'o) o'y. (3.32) 
Exercise 3.8. Prove that the solution of (3.31) is given by (3.32). 
Example 3.9. Here we consider an example of regularization. With the data given 


in Example 3.7, we compute a 10-th order polynomial by the regularized least 
squares. We take the regularization parameter A = 1, and compute the solution æ* 
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S 


0 2 4 6 8 10 
Figure 3.7. Regularized least squares solution with 10th-order polynomial (solid curve) 
and the original sinusoid (dashed curve). 


by the formula (3.32). The obtained coefficients are as follows: 


0.1448 
0.2691 
0.1865 
0.0769 
—0.0334 
x* =| —0.0674 |. (3.33) 
0.0386 
—0.0085 
0.0010 
—0.0001 
0.0000 


The boldface values are the three largest elements in the absolute values. Compared 
with the coefficients #19 in (3.30) of 10-th order interpolating polynomial, the 
values in 2* are much smaller. The curve with the regularized least squares is shown 
in Figure 3.7. We can see that the 10-th order polynomial by the regularized least 
squares shows a comparable accuracy to the 6-th order least squares solution. 


3.1.4 Weighted Ridge Regression 


Here we further consider the problem of polynomial interpolation. In the regular- 
ized least squares, we minimize the cost function in (3.31) with the regularization 
term alee This is to make the coefficient vector æ not so large. Instead of this, we 
consider the L? norm of the polynomial f(t). The L? norm of a function f(t), 
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t € [ti, tm] is defined as 
tm 
Ifl = / FOP dt. (3.34) 
Jti 
Since f(t) is a polynomial 
n—-1 
TO=} at’, (3.35) 
i=0 


the L? norm can be computed as 


tm {a1 n—-1 
, | | 
ifthe = f (Zat) (Dare! | a 
i o j=0 


t 


n—-ln-1 tn 
-SS aa; f” itia 
LaLa J, (3.36) 
n—ln-l1 i+j+1 i+j+l 
tm = 4, 
= > S aiaj ; + i + 1 
i j=) PTJ 
=a" Ox, 
where Q = [Q;j] is a matrix defined by 
ftitl _ iti 
m 1 i 
ra , 17 =0,1,...,n—-1. 3.37 
Qij ZFA J ( ) 


Now, from the definition of L? norm, we have || f ||z2 > 0 for any polynomial f, 
and ||fllz2 = 0 if and only if f = 0. This means that for any x € R”, we have 
x' Qx > Oand a! Qg = 0 ifand only if x = 0. That is, the matrix Q is positive 
definite. 
Now, we consider a regularization problem minimizing 

1 a) 1 A 2 

sla — yl + FIFI = sla — yl? + zl Pel, (3.38) 
where is a matrix that satisfies Q = YTY. This is called the weighted ridge 
regression. The solution is obtained by 


a =(0'O+1¥'Y)!oly. (3.39) 
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Table 3.1. Summary of optimization problems with £? norm. 


Problem Size Problem Solution 
min £? norm m<n minzllæl s.t. P£ = y o'(o@!)ly 

least squares (LS) m >n minz |x — yll5 (Dld) lol y 
regularized LS any min3|| Ox —yll5+4ials DTI +D ly 


=(1+0'o®)'oly 


Exercise 3.10. Prove that (3.39) is the solution of the optimization problem mini- 
izing (3.38). 


3.1.5 Summary of -Norm Optimization 
Now we summarize the curve fitting problem by (m — 1)-th order polynomial 
y = ft) = anit” | + ag_ot” ? +--+ +. ait + a0, (3.40) 
with data 
D = {(t1, y1), 2, Y2); - - +s Cm, Ym)}- (3.41) 
Table 3.1 shows the summary. 


Exercise 3.11. Prove that for any matrix ® e R”*” and any number 4 > 0, 
matrices AJ + ØO! and AJ + OTO are invertible and satisfy 


o'Q1+00')!'=(1+0'o0)'o!', (3.42) 


3.2 Sparse Polynomial and ¢!-norm Optimization 


Here we consider yet another example of curve fitting. Let us consider an 80-th 
order polynomial 


y= ap 4r. (3.43) 
From this polynomial, we sample data points with sampling instants 
ti =0,f =0.1,t3 =0.2,...,%, =1, (3.44) 
to obtain 


D ={(t, y1)s (2, 92)5--- ty}, yi = OBO + ti. (3.45) 
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Figure 3.8. Sparse polynomial y = —18° + t and sampled data. 


Figure 3.8 shows the curve of the 80-th polynomial in (3.43) and the generated 
data in (3.45). 

We assume that the order of the original polynomial is previously known to be 
at most 80. Then, can we reconstruct the original curve in (3.43) from the data 
D? In this case, there are infinitely many interpolating polynomials with order at 
most 80 that go through all the data points. In fact, the Vandermonde matrix ® 
in (3.16) is a fat matrix of size 11 x 81, and since the condition (3.19) holds, ® 
has full row rank and there exist infinitely many solutions of the linear equation 
x = y, where æ is a column vector consisting of 81 unknown coefficients, and 
y is a column vector consisting of data y1, y2,..., y11. As mentioned in Section 
3.1, we need additional proofs to obtain the unique solution. 

Let us look again at the original 80-th order polynomial in (3.43). The coef- 
ficients of this polynomial are all zero but two coefficients. In other words, the 
coefficient vector © = (ago, 479, ..., 0) is sparse, that is, æ has small €° norm. 
We call such a polynomial a sparse polynomial. We assume that the following fact 
can be additionally used as our proof. 


The original polynomial is sparse. 


Note that we can use the sparsity property of the original polynomial but the num- 
ber of non-zero coefficients (i.e. ||a||9) is assumed to be unknown. 

Borrowing the idea of the optimization mentioned in Section 2.3, we use the £? 
norm as the cost function, and consider the following optimization problem: 


minimize ||a||9 subject to Bx = y. (3.46) 
xeR” 


As mentioned in Section 2.4, this is quite hard to solve using the exhaustive search 
method when the problem size is large. 
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The key idea of sparse optimization is to use the £! norm 


lel = >) Bil (3.47) 


i=l 


instead of the €° norm. That is, we consider the following optimization problem 
as relaxation of the £? optimization (3.46): 


minimize ||x||; subjectto Ox = y. (3.48) 
xeR” 


We call this optimization the £! optimization. The method to obtain a sparse vector 
by the £! optimization is known as the basis pursuit. 

The £! optimization problem in (3.48) is to find the smallest £!-norm vector on 
a linear subspace {a € R” : Ox = y}. As illustrated in Figure 2.2 in Chapter 2, 
the contour of the £! norm (||a|1 = c) is a diamond whose corners are on the axes. 
The optimal solution of (3.48) is obtained (in the 2-dimensional case) by enlarging 
the contour ||a||] = c from c = 0 until the contour touches the linear subspace 
{a e R? : Ox = y}. As shown in Figure 3.9, the linear subspace {a € R? : 
x = y} touches almost surely the £! contour on one of its corners. Since each 
corner is on one of the axes, the optimal solution satisfies ||æ* lọ = 1 < 2, and 
hence it is sparse. This is an intuitive explanation why minimizing £! norm gives a 
sparse solution. 

The relation between the £? and £! norms is intuitively understood as follows. 
By definition, the £? norm can be rewritten as 


n 


islo = >æ’, (3.49) 


i=l 


Figure 3.9. /! optimization in R?: the contour {æ : ||a||; = c} touches the linear subspace 
{x : Ox = y} on one of the corners that are on axes. 
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Figure 3.10. Relation between |x| and |x|°. 


where |x|? = 1 if x 4 0 and 0° £ 0. Figure 3.10 shows the graph of |x|°. From 
this figure, it is easily seen that the £? norm is non-convex. On the other hand, the 
£! norm 


lel = $. lxil (3.50) 


i=l 


is a sum of absolute values |x;|, which is convex as shown in Figure 3.10. The £ l 
norm is the best convex approximation of the £? norm in the sense that it has the 
minimum exponent p = 1 among €? norms that are convex. Theoretically, the 
£! norm is also understood as the convex relaxation of the £? norm. That is, the £! 
norm is the second conjugate || - ||5* of || - Ilo. See [112, Section 1.3] for details. 

In the £! optimization problem in (3.48), the cost function |||; is a convex 
function of a, and the constraint set {x € R” : Ox = y} is a convex set. Therefore, 
the problem is a convex optimization problem,’ for which numerical optimization by 
using a computer can give a numerical solution much faster than the exhaustive 
search for the £° optimization in (3.46). 

To obtain a sparse vector, we can also use the idea of regularization for sparse 
optimization. In the case of noisy data, we can formulate the curve fitting as the £? 
regularization described below: 


1 
minimize 5|| bx — yll + Allallo. (3.51) 


xeR” 


Unfortunately, this optimization is also a combinatorial problem, and hard to 
solve if the problem size is large. Instead of the €° norm for the regularization term, 
we use the £! norm and consider the following optimization problem: 


1 
minimize =||®x — yll5 + Allal|s (3.52) 
xeR” 2 


1. For the mathematical definition of convexity and convex optimization, see Chapter 4. 
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This is called £! regularization, or LASSO (Least Absolute Shrinkage and Selec- 
tion Operator). The cost function in (3.52) is a convex function of æ, and hence 
the optimization is a convex optimization problem, which can also be solved very 
efficiently. 

In this section, we have introduced the important idea to approximate the £? 
norm, which is non-convex and discontinuous, by the £! norm, which is convex. 
A question is when the convex optimization problem (3.48), or (3.52) can give 
the solution of the original £° optimization (3.46), or (3.51). Very interestingly, 
in many applications (e.g. signal/image processing), the solution of the £! norm 
optimization is equivalent to (or sufficiently close to) the €°-norm solution. In 
fact, there exist many theorems for the equivalence between €° and £! optimiza- 
tions. From these facts, £! optimization is often said to be sparse optimization. 
For seeking sparse solutions, there also exist many methods other than £! opti- 
mization, for example, greedy methods, which we will see in detail in Chapter 5, 
or €?-norm optimization with p e (0,1). In the next section, we will show 
how to solve the £! optimization (3.48) or £! regularization (3.52) by using 
MATLAB. 


3.3 Numerical Optimization by CVX 


The optimization problems (3.48) and (3.52) are convex ones, and they can be 
efficiently solved by numerical optimization. We here introduce a well-known soft- 
ware for numerical convex optimization, CVX,* which is a free software running 
on MATLAB. 

Let us consider the 80-th order polynomial y = —189 + ¢ in (3.43). From 
this, we generate 11 data points as in (3.45), and we try to reconstruct the original 
polynomial from these data. 

First, we define the coefficient vector of the 80-th polynomial. The following is 
a MATLAB code to do this. 


%% Coefficient vector of the 80-th order polynomial 
x_orig = [-1,zeros(1,78),1,0]’; 


2. http://cvxr.com/cvx/ 
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Next, by using MATLAB function polyval that returns the value of the polyno- 
mial from the coefficient vector, we make a data set (3.45) as follows. 


%% Data 


t = 0:0.1:1; 
y = polyval(x_orig,t); 


With the data, we find the 10-th order interpolating polynomial. For this, we 
compute the Vandermonde matrix (3.16): 


%% Vandermonde matrix 
Phi = vander(t); 


From this, we compute the coefficients of the interpolating polynomial by using 


(3.20). 


%% Coefficients of interpolating polynomial (10-th order) 
x = inv(Phi) * y’; 


Note that since y is a row vector, we take its transpose y’. Now, let us draw 
the curve of the interpolating polynomial. We discretize the time axis into small 
intervals, and draw the curve. 


%% Draw curve 


time = 0:0.01:1; 
plot(time, polyval(x, time)); 


Figure 3.11 shows the result. In this case, the data are noiseless and there is no 
oscillation due to overfitting. However, we can see a large gap in the range from 
t=0.9tot=1. 

Then, let us compute a curve by the regularized least squares (3.31) with a 
10-th order polynomial. We choose the regularization parameter as 4 = 0.2, and 
compute the coefficient by the formula (3.32). 
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Figure 3.11. 10-th order interpolating polynomial (solid curve) and the original polynomial 
y = —t® + t (dashed curve). 
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Figure 3.12. 10-th order polynomial by regularized least squares (solid curve) and the 
original polynomial y = —18° + + (dashed curve). 


%% Regularized least squares 


lambda = 0.2; 
x = inv(lambda * eye(11) + Phi’ * Phi) * Phi’ * y’; 


Figure 3.12 shows the result. The obtained curve has a poor fit to the original 
curve y = =7° +t. 

Finally, we compute the curve by £! optimization (3.48). We assume that 
we know the polynomial order is at most 80. In this case, the Vandermonde 
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matrix (3.16) becomes a fat matrix of size 11 x 81. This matrix can be obtained as 
follows. 


%% Vandermode’s matrix 
Phi =[]; 


for m = 0:80 
Phi = [t’.°m, Phi]; 
end 


Define the coefficient vector x and the data vector y as in (3.17), then the con- 
dition for interpolation is described as 


Or = y. (3.53) 


We seek the sparsest solution in the linear subspace {a e R8! : Ox = y} by 
solving the £! optimization problem in (3.48). 

To solve the £! optimization problem, we use CVX on MATLAB. By using 
CVX, the optimization problem can be very easily coded. 


%% L1 optimization by CVX 
cvx_begin 
variable x(81) 
minimize norm(x, 1) 
subject to 
Phi * x == y’ 


cvx_end 


You should compare this code with (3.48). You can write a code very intuitively 
for an optimization problem. This is the strongest point of CVX. You can solve 
many convex optimization problems other than the £! optimization in a similar 
way. We recommend for beginners to use CVX to solve convex optimization prob- 
lems.” 

Let us draw the curve with the coefficients obtained by the £! optimization. 
Figure 3.13 shows the curve. We can see that the obtained curve is almost the same 
as the original curve y = —1°° + t. This is the power of £! optimization. The 
complete MATLAB program for £! optimization is shown below. Enjoy! 


3. CVX is also available in Python. See cvxopt.org for details. 
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Figure 3.13. 80-th order polynomial by £! optimization (solid curve) and the original poly- 
nomial y = —t® + t (dashed curve). 


3.4 Further Readings 


For least squares, regularization and overfitting in curve fitting, I recommend to 
read standard textbooks [8, 44] of machine learning. The least squares method 
is deeply related to projection and generalized inverse, for which you can choose 
a textbook of [43]. The mathematical theory and generalization of LASSO can 
be found in [15, 40, 44, 45]. For the equivalence theorems between €° and £! 
optimizations, refer to text books [37, 38, 112]. 


MATLAB program for the coefficients by £! optimization using CVX. 


clear 

%% Polynomial coefficients 
x_orig = [-1,zeros(1,78),1,0]’; 
%% Sampling 

t = 0:0.1:1; 

y = polyval(x_orig,t); 

%% Data size 

N = length t); 

M = N-1; 

%% Vandermonde matrix 
Phi_v = vander(t); 

%% Interpolation polynomial with order 10 
x_i = inv(Phi_v)*y’; 

%% LASSO 
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% Order of polynomial 
M_I = 80; 
% Design matrix 
Phi_l=[1; 
for m=0:M_| 
Phi_l = [t’.”m,Phi_l]; 
end 
% CVX 
cvx_begin 
variable x(M_I+1) 
minimize norm(x,1) 
subject to 

Phi_l*x == y’ 
cvx_end 
%% Plot 
tt = 0:0.01:1; 


figure; 

stem(t,y); hold on 
plotctt,polyval(x_orig,tt),’-’); 
plot(tt,polyval(x,tt)); 


DOI: 10.1561/9781680837254.ch4 


Chapter 4 


Algorithms for Convex Optimization 


In the previous chapter, we have seen that convex optimization such as £! optimiza- 
tion is efficiently solved by using CVX on MATLAB. Such a tool is actually very 
useful for small or middle scale problems. However, if you treat a very large-scale 
problem like image processing, CVX might be insufficient. Moreover, if you want 
to apply the sparsity method to control systems, you should compute sparse opti- 
mization in real time (e.g. in a few msec) with a cheap device on which MATLAB 
cannot be installed. In such cases, you should instead write an efficient algorithm 
by yourself for your specific problem. This means that you should look into the 
black box of the toolbox. 

For this purpose, we review basics of convex optimization, and introduce effi- 
cient algorithms for problems in sparse optimization. 


Key ideas of Chapter 4 


e In convex optimization, a local minimum is a global minimum. 
e £l optimization problems appeared in this book are convex optimiza- 


tion. 


e Proximal operators are used to derive fast algorithms for convex opti- 


mization with non-differentiable €! norm and constraints. 
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4.1 Basics of Convex Optimization 


We here review important facts in convex optimization. Let us begin with the def- 


inition of a convex set. 


Definition 4.1 (convex set). Let C be a subset of IR". C is said to be a convex set if 
the following inclusion 


te+(1—t)hyeCc (4.1) 
holds for any vectors x, y € C and for any real number t € [0, 1]. 


Figure 4.1 illustrates a convex set and a non-convex set (i.e. a set that is not convex). 
In convex set C, the line segment between any two points x and y in C lies com- 
pletely in C. On the other hand, in a non-convex set, there exists a line segment 
that partially lies outside of the set. 


Exercise 4.2. Prove that if C and D are convex, then C N D is also convex. 


In convex optimization, we often handle a function f : R” —> R U {oo}, 
which takes values on extended real numbers R U {00}. The following function is 


an example: 


pon he if jello < 1, a 


oo, if ||a|l2 > 1. 


This function is called an indicator function, which will be explained in Sec- 
tion 4.2.4. 
The effective domain of a function f is defined by 


dom(f) £ {a e R” : f(a) < oo}. (4.3) 


That is, the effective domain of a function f : R” —> R U {co} is a set in R” on 
which f takes finite real values. For example, the effective domain of the indicator 
function (4.2) is given by 


dom(f) = {x € R": æll2 < 1}. (4.4) 


Figure 4.1. Convex set (left) and non-convex set (right). 
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Figure 4.2. Convex function (left) and non-convex function (right). 


A function is said to be proper if its effective domain is non-empty, that is, there 
exists at least one x € R” such that f(a) < co. 
Now, let us define a convex function. 


Definition 4.3 (convex function). Let f : R” — R U {00} be a proper function. 
The function f is said to be a convex function if the following inequality 


f(te@+(U—ty) <tf@+d-nfy) (4.5) 
holds for any vectors x, y € dom( f) and for any real number t € [0, 1]. 


Figure 4.2 illustrates a convex function and a non-convex function, a function that is 
not convex. By definition, if f is convex, the line segment between any two points 
(a, f(x)) and (y, f(y)), where x, y € dom(f), lies above or on the graph of f. 
On the other hand, if f is non-convex, there exists a line segment that partially lies 
below the graph. 


Exercise 4.4. Suppose that f and g are real-valued convex functions. Prove that 
f + is also convex. 


One more important property of a function is closedness. A function f : R > 
R U {oo} is said to be closed if the sublevel set (or lower level set) {x € dom(f) : 
f(a) < c} is a closed set for any c e R. The closedness of a function is also 
understood by its epigraph. The epigraph epi( f) of function f is defined by 


epi(f) = {(a, t) € R” x R: x e dom(f), f(x) < t}. (4.6) 


Figure 4.3 illustrates the epigraph of a function f. The epigraph of f is the region 
above the graph on its effective domain. It is easily shown that a function f is closed 
if and only if its epigraph is closed. 

We can also perceive other properties of a function in terms of its epigraph. A 
function is convex if and only if its epigraph is convex. A function is proper if and 
only if its epigraph is non-empty. We summarize these facts in Table 4.1. 
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Figure 4.3. Epigraph epi(f) of function f. 


Table 4.1. Function and its epigraph. 


Function f Epigraph Epi(/) 
convex convex set 
closed closed set 
proper non-empty set 


Now, we formulate a convex optimization problem in a general form. 


Convex optimization problem 


Let f : R” > RU {oo} bea proper, closed, and convex function, and C C R” 
be a closed convex set. Then, a convex optimization problem is a problem to find 
a vector £* € R” that minimizes the function f(a) over the set C C R”. 


For the convex optimization, we use the following terminology in this book: 


e The function f (æ) is called a cost function or an objective function. 
e The set C is called a constraint set or a feasible set. 

e The entries of C are called feasible solutions. 

e The inclusion æ € C is called a constraint 


The above optimization problem is often described as follows: 


minimize f(x) subject to æ e cC. (4.7) 
axel" 


In this expression, the optimization variable æ € R” to be minimized is placed 
under “minimize,” next to which the cost function f(a) is placed. The term “sub- 
ject to” is sometimes abbreviated as “s.t.,” followed by the constraint æ € C. The 
term “minimize” in (4.7) is often abbreviated as “min” and simply described as 


min f(a) st. x EC. (4.8) 


xeR” 
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minimize f(x) subject to æ E€ C 
xER” Se b 
cost function constraint 
min f(x) — minimum value 
BEC 


arg min f(a) minimizer (set) 
xzEC A 


Figure 4.4. Notation for optimization problem. 


Also, we often write the constraint under “minimize” as 
min f (a). (4.9) 
xreC 


Note that (4.9) usually means the minimum value of the optimization problem 
(4.7), instead of an optimization problem. The set of minimizers (solutions) of the 
optimization problem (4.7) is denoted using “arg” (abbreviation of argument) as 


arg min f (æ) = {x* e C: f(a*) < f(x), Yæ eCNdom(f)}. (4.10) 


zec 


Also, we often use the following expression 


x* = arg min f (æ). (4.11) 
xec 
In this expression, “argmin” returns a minimizer, instead of the set of minimizers. 
If the minimizer of the optimization problem (4.7) is unique, then this expression 
may not cause any confusion. If not unique, (4.11) means that a* is a minimizer 
arbitrarily taken from the set of minimizers. 
We summarize the definitions in Figure 4.4. 
Then we define a local minimizer and a global minimizer of the optimization 
problem (4.7). If there exists an open set B that contains a feasible solution % € 


C N dom(f) such that 
f(a) > f(@), YxeBNC, (4.12) 


then f (®) is called a local minimizer of the optimization problem (4.7). Ifa feasible 
solution a* € C satisfies 


f(x) > f(a), Varec, (4.13) 


then 2* is called a global minimizer of the optimization problem (4.7). 
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f(a) 


T = a" 
Figure 4.5. Local minimizer z and global minimizer æ* with convex function (left) 
and non-convex function (right). 


One of the most important properties of convex optimization is that a local 
minimizer is (if it exists) a global minimizer. Figure 4.5 illustrates this fact of convex 
optimization. In this figure, for a convex function, the local minimizer is also the 
global minimizer æ*. On the other hand, for a non-convex function, they may not 
coincide. In fact, the following theorem holds [13, Section 4.2.2]. 


Theorem 4.5. For a convex optimization problem (4.7), any local minimizer is (if it 
exists) a global minimizer, and the set of global minimizers is a convex set. 


By this theorem, an algorithm that outputs a local minimizer of a convex opti- 
mization problem is automatically an algorithm for a global minimizer. For exam- 
ple, a convex optimization problem with a differentiable and convex function f (x) 
and C = R” (unconstrained problem), a point £ such that V f(a) = 0, where V f 
is the gradient of f , is a local minimizer, and this is also a global minimizer. There- 
fore, for an unconstrained convex optimization with a differentiable cost function, 
an algorithm searching for a point satisfying V f(a) = O is an algorithm for a 
global minimizer. This idea is very important to derive an efficient algorithm for 


convex optimization. 
Exercise 4.6. Find a convex function that has no local minimizer. 
Exercise 4.7. Find a convex function that has infinitely many local minimizers. 


Next, we consider the uniqueness of the minimizer. For this, we define strictly 
and strongly convex functions. 


Definition 4.8. Let f : R” —> RU {00} be a proper function. The function f is 
said to be a strictly convex function if for any x,y € dom(f) C R” with £ # y 
and anyt € (0, 1), 


f(te+(U— ny) <tf(w) + -Aflu (4.14) 
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Moreover, the function f is said to be a strongly convex function if there exists p > 0 
such that for any x, y € dom( f) C R” and anyt € [0, 1], 


f(te+(U— ty) <tf@)+d-nfy)-td- ae -yll (4.15) 
The constant p is called a modulus. 

The following lemma is an important property of strongly convex functions. 
Lemma 4.9. A function f : R” > RU {oo} és strongly convex with modulus p > 0 
if and only if 
-12 (4.16) 


is convex. 


Weierstrass extreme value theorem is also important in convex optimization. 


Theorem 4.10 (Weierstrass extreme value theorem). Every continuous function 
on a compact set attains its extreme values on that set. 


Note that a subset in R” is compact if and only if it is closed and bounded. 
The following theorem shows the existence and uniqueness of the minimizer of 
a strongly convex function. 


Theorem 4.11. Assume f : R” — RU {00} is a proper, closed, and strongly convex 
function with modulus p > 0. Then f has the unique minimizer x* € dom(f). 
That is, for alla € dom(f) such that x # x”, 


f(a) > f(a"). (4.17) 


Moreover, for any x € dom( f), we have 


B 


f(a) > f(a*)+ 5 læ = æ* l3. (4.18) 


This theorem is used to define the proximal operator discussed in the next section. 


Exercise 4.12. Prove Theorem 4.11. 


4.2 Proximal Operator 


We here introduce a powerful tool called the proximal operator for deriving efficient 
algorithms to solve convex optimization problems. 
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4.21 Definition 


The proximal operator of a function is defined as follows: 


Definition 4.13 (proximal operator). For a proper, closed, and convex function 
f : R” > RU {oo}, and a real number y > 0, the proximal operator prox, p with 
parameter y is defined by 


1 
prox, s (v) £ arg min [ræ +e- o}. (4.19) 
zedom(f) 2y 


First, we can easily show that the function 
A 1 2 
g(x) = f(x) + z" — v|l5 (4.20) 


is a proper, closed, and strongly convex function with modulus 6 = 1/y (see 
Definition 4.8). Therefore, from Theorem 4.11, the proximal operator (4.19) is 
well-defined, that is, prox, ¢(v) uniquely exists for any v € R”. 


Exercise 4.14. Assume that f is a proper, closed, and convex function, y > 0, 
and v € R”. Prove that the function g(a) in (4.20) is proper, closed, and strongly 
convex function with modulus $ = 1/y. 


From (4.19), if we take y — oo, then the second term of (4.19) disappears and 
the proximal operator becomes 


prox, ¢(v) = argmin f(x) = z“ (4.21) 
xedom(f) 


where æ* is a minimizer of f(a). On the other hand, taking y — 0 eliminates 
the first term of (4.19), and the proximal operator is reduced to 


Proxy s (v) = argmin |ja — vli? = Ilew), C êdom(f), (4.22) 
xedom(f) 


where Ic is the projection operator on the set C. That is, Ie returns the closest point 
in C measured by the £? norm. Finally, if the parameter y satisfies0 < y < 00, the 
proximal operator (4.19) is a mixture of the minimizer in (4.21) and the projection 
operator in (4.22). 

Figure 4.6 illustrates the proximal operator. By definition, if a point v is outside 
the effective domain dom( f), then prox, p (v) moves into dom(f). Ifa point v is in 
dom(f), then prox, (v) moves in dom( f), and approaches towards the minimizer 
x* of f (x), with a step size determined by the value of y . Therefore, the effective 
domain dom(f) is an invariant set under the proximal operator prox, p. Note that 
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Figure 4.6. Illustration of proximal operator. 


a set C is called an invariant set under an operator T if 
xzeC = T(aeC (4.23) 
holds. 


Exercise 4.15. Prove that the effective domain dom( f) is an invariant set under 
the proximal operator prox, p- 


As illustrated in Figure 4.6, if v € dom(f), then the vector prox, (v) approaches 
towards the minimizer x* in the effective domain dom( f). That is, a proximal 
operator works like a negative gradient in the effective domain. Note that the gradi- 
ent cannot be defined for non-differentiable functions, while the proximal operator 
has no such restriction. 


4.2.2 Proximal Algorithm 


From the invariance property of (4.23), we can consider an iterative algorithm 
called the proximal algorithm for a minimizer x* of function f: 


Proximal algorithm 


Initialization: give an initial vector x[0] and positive numbers yo, y1, ... 
Iteration: fork = 0,1,2,..., do 


z[k + 1] = prox,, ¢(a[k]). (4.24) 


If you properly choose the parameter sequence {yx}, you can obtain one of the min- 
imizers of f by the proximal algorithm. The convergence is shown in the following 
theorem [7, Proposition 5.1.3]: 


Proximal Operator 63 


Theorem 4.16 (convergence of proximal algorithm). Suppose that the parameter 
sequence {yk} satisfies yk > 0 for allk and 


> ve = 00. (4.25) 
k=0 


Then, the vector sequence {x[k]} generated by the proximal algorithm (4.24) converges 
to one of the minimizers of f for any initial vector x[0]. 


The theorem is based on the fact that a minimizer of f (æ) is also a fixed point 
of its proximal operator prox, p. Note that a fixed point of prox, p is a point that 
satisfies 


x = prox, (a). (4.26) 


A fixed point is literally fixed under the operation by prox, p. 
The proximal algorithm minimizes the strongly convex function 


1 
g(x) = f(x) + z-le — z[k] (4.27) 
Yk 


at step k. In other words, the algorithm approximates a general convex function 
f(x) by a strongly convex function at each step. 

Also, it is often important to find a closed form of the proximal operator for an 
efficient algorithm. A function for which the proximal operator (4.19) is obtained 
in a closed form is sometimes called proximable. Let us see some proximable func- 
tions in the following subsections. 


4.2.53 Proximal Operator for Quadratic Function 


Let us consider the following quadratic function 
1 
f(£)= zu! be ~y'a, (4.28) 


where Ọ is a positive-definite symmetric matrix. Note that a symmetric matrix ® is 
said to be positive definite if the following inequality holds 


x! Ox > 0, (4.29) 


for every nonzero vector « € R”. Let us compute the proximal operator of the 
quadratic function in (4.28). From the definition of the proximal operator in 
(4.19), we have 


1 1 
prox, -(v) = arg min [zeton — ylz + TA — v)! (x — v] . (4.30) 


xeR” 
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Since the function in (4.30) is differentiable, we can obtain the minimizer by setting 
the gradient to be zero. After some calculations, we have the proximal operator in 
a closed form: 


i= 1 
prox, ¢(v) = (© + 1) (v + Lo) : (4.31) 
y 7 
Exercise 4.17. Prove that equation (4.31) holds. 


An important application of this proximal operator is numerical matrix inver- 
sion. The minimizer x* of (4.28) is also the unique solution of linear equation 


Or = y, (4.32) 


that is, 2* = © ly. Note that ® is invertible since ® is positive definite. Then, let 
us assume that the condition number (the ratio of maximum and minimum eigen- 
values of ®, i.e., Amax(®)/Amin(®)) is very large so that the numerical computation 
of the inverse is difficult. We call such a case ill-conditioned. For an ill-conditioned 
case, the proximal algorithm (4.24) is used to safely compute the inverse. From 
(4.31), the proximal algorithm to obtain the minimizer of (4.28), which is also the 
solution of (4.32), is given as follows: 


Proximal algorithm for “ty 


Initialization: give an initial vector æ[0] and a positive number y > 0 
Iteration: fork = 0,1,2,..., do 


1.\7! 1 
rik +1) = (© + ~1) (v + Lalki) . (4.33) 


If the positive number y is sufficiently small, then the condition number of matrix 
® + (1/y JI is relatively small, and the inversion can be easily computed numeri- 
cally. 


Also, if y is sufficiently small, we have 
1 =] 
(o+—1) =y (+y! + yy -— y0). (4.34) 
Then, the right-hand side of the proximal operator (4.31) becomes 


1 \7! 1 1 
(o+-1) (v+) ~ra -7(v+ 2») 
y y y 


x v — y (Dv — y) 
=v—yVf(v). 


(4.35) 
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C 


Figure 4.7. Indicator function Ze(x) on a closed interval C e R. 


That is, if y is sufficiently small, the proximal algorithm (4.33) is approximately 
the gradient descent algorithm 


alk +1] = alk] —yVf(alk), &=0,1,2,... (4.36) 
to find the minimizer of the quadratic function (4.28). 


4.2.4 Proximal Operator for Indicator Functions 


The indicator function of a non-empty set C C R” is defined by 


Tela) ê 0, if vec, (4.37) 
oo, if a ¢C. 


If the set C is non-empty, closed, and convex, then the indicator function I¢ (x) 
is a proper, closed, and convex function (to check this, draw the epigraph). For 
example, the indicator function I¢(x) of a closed interval C on R is illustrated in 
Figure 4.7. You can see that if C is a non-empty closed interval, the epigraph is a 
non-empty, closed, and convex set. 

Let us compute the proximal operator of the indicator function Jc. From the 
definition (4.19), the proximal operator of Ic is given by 


; 1 
prox, 7, (v) = arg min [ræ + z |” — vi 


xeR” 


= arg min ||æ — vll? (4.38) 
xeC 


= IIe(v). 
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That is, the proximal operator of the indicator function Ic is the projection operator 
IIc onto the set C. 


Exercise 4.18. Suppose that C C R” is a non-empty, closed, and convex set. Prove 
that II¢(v) is uniquely determined for any v € R”. 


4.2.5 Proximal Operator for £! Norm 


Let us compute the proximal operator (4.19) for the £! norm: 
f(a) = lela = Do lai, (4.39) 
i=l 


where x; is the i-th element of x € R”. From the definition of the proximal oper- 
ator, we have 


, 1 2 
prox, y. (0) = argmin {le + 5 tee ~ vi] 
Yih pa By 


(4.40) 


xeR” 


n 
1 
= argmin >» {us + Dy WF = of. 
i=l 


where v; is the i-th element of v. This optimization can be reduced to element-wise 
optimization, that is, 


n n 
. 1 , 1 
-Paa [tal + ger?) 


(4.41) 
Therefore, we just solve the following scalar minimization problem: 
dts 1 2 
minimize |x| + —(x —v)°. (4.42) 
xeR 2y 


The minimizer x* € R can be easily calculated, which is given by 


v— y, ifv>y, 
x* = S, œ) Ê 0, if =y <v <y, (4.43) 
v+y, ifo<-y. 
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Figure 4.8. Soft-thresholding operator S, (v). 


The function S, (v) in (4.43) is called the soft-thresholding operator. Figure 4.8 
shows the graph of the soft-thresholding operator. 


Exercise 4.19. Show that the minimizer x* of the function 
1 
fœ) Ê |x] + aye v)? (4.44) 


is given by (4.43). (Hint: divide the domain of f (x) in two intervals: x > 0 and 
x < 0. Then, consider the three cases for v: 0 > y, —y <v <y,andv < —y). 


By using the scalar-valued soft-thresholding operator, the proximal operator of 
the £! norm is given by 


[prox, -(v)]; = Sy (vi), (4.45) 


where [ ]; denotes the i-th element of the vector in the square bracket. For a 
simple expression, we extend the definition of the scalar-valued soft-thresholding 
operator (4.43) to vectors. For a vector v € R”, we define the vector-valued soft- 
thresholding operator $, (v) by 


[S, (v)]i = Sy i), (4.46) 


where [S, (v)]; is the i-th element of S, (v). With this notation, the proximal 
operator of the £! norm (4.45) is simply rewritten as 


PLOX, Jh (wv) = Sy (w). (4.47) 


Exercise 4.20. Let Q € R”*” be an orthogonal matrix. Prove that the minimizer 
x* €e R” of the following function 


a l 
f(x) = slow — ylz + Alla (4.48) 
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tH (v) J 


Figure 4.9. Hard-thresholding operator H; (v). 


is given by 
a* = S$,(Q'y). (4.49) 


Note that Q is orthogonal if and only if 


oQ' =0'0 =]. (4.50) 


In summary, the proximal operator of the £! norm is the soft-thresholding oper- 
ator S, (v). If the absolute value of an element v; in v is less than y, then the 
element is set to be zero by the proximal operator. This is an important property 
to understand why £! optimization gives a sparse solution. 

The word ‘soft means that the operator is continuous (see Figure 4.8). We can 
also define the hard-thresholding operator by 


v, if lol >A, 
Ho) ê 4.51 
a0) i eile: oon) 


Figure 4.9 shows the graph of the hard thresholding operator. We can see from 
this figure, the hard-thresholding operator is discontinuous. An interesting fact 
is that the hard-thresholding operator is the proximal operator of the £? norm 
with A = ./2y. Strictly speaking, this is dubious since the proximal opera- 
tor is defined for proper, closed, and convex functions (see Definition 4.13), but 
the £° norm is not convex. However, the hard-thresholding operator is very use- 
ful to derive efficient algorithms for €°-norm optimization. See Chapter 5 for 
details. 


Exercise 4.21. Compute the proximal operator (4.19) of the £? norm, and show 
that it is the hard-thresholding operator (4.51). 
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4.3 Proximal Splitting Methods for £! Optimization 


In this section, we derive an efficient algorithm based on proximal splitting to solve 
the £! optimization: 


minimize ||æ||ı subjectto a = y, (4.52) 

xeR” 
where ® € R”*”” and y € R” are given. We assume thatm < n and ® has full row 
rank, that is, rank(®) = m. The cost function is the £! norm, which is obviously a 


proper, closed, and convex function. Then, let us consider the constraint. Denote 
by C the set of vectors x € R” satisfying the constraint Da = y. That is, 


cs {a e R” : Ox = y}. (4.53) 
It is easy to prove that this set is a non-empty, closed, and convex set in R”. 


Exercise 4.22. Show the cost function ||æ]|ı in (4.52) is a proper, closed, and 
convex function. Also, show the set C defined in (4.53) is a non-empty, closed, and 
convex set in R”. 


Then, consider the indicator function Te (æ) for C: 


Ie(x) = Oe (4.54) 
œ, if Orfy. 


By using this, the optimization problem in (4.52) is equivalently rewritten as 


minimize læli + Ie(a). (4.55) 
me n 


Note that the functions ||æ||ı and J¢(a) are both proper, closed, and convex func- 
tions, and hence the sum of them, ||æ||1 + Ze (æ), is also proper, closed, and convex. 


Exercise 4.23. Suppose that two functions, f and g, are proper, closed, and con- 
vex. Suppose also that dom( f) N dom(g) # Ø. Prove that f + g is also a proper, 


closed, and convex function. 


Note that for the optimization problem (4.55), we cannot obtain the proximal 
operator of the cost function 


f(x) = æli + Ie), (4.56) 


in a closed form. In other words, f(a) is not proximable. That is, we can- 
not directly apply the proximal algorithm (4.25) to this problem. However, the 
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proximal operators of the two functions 
fie) = lelin Hæ) S res) (4.57) 


can be obtained as the soft-thresholding operator in (4.47) and the projection oper- 
ator onto C defined in (4.38), respectively. The idea is to split the cost function as 
f = fi + f2, and write an algorithm using the proximal operators of f and fo 
separately. Algorithms designed by this idea are called proximal splitting algorithms. 
For the problem (4.55), we introduce two proximal splitting algorithms. 


4.3.1 Douglas-Rachford Splitting Algorithm 
Let us consider the following optimization problem in a general form: 


minimize fi(x) + f(x), (4.58) 


where fı and f> are proper, closed, and convex functions. The Douglas-Rachford 
splitting algorithm for (4.58) is given as follows: 
Douglas-Rachford splitting algorithm for (4.58) 


Initialization: give an initial vector z[0] and a parameter y > 0 
Iteration: fork = 0,1,2,...do 


[k +1] = [k] 
£ prox, f (z[k]) rer 
z[k + 1] = z[k] + prox 


„p elk + 1] — z[k]) — alk + 1] 


From the algorithm, we can derive an algorithm for our unconstrained problem 
(4.55). In our case, fi(a) = ||a||1 and fo(a) = Ice(æ), for which the proximal 


operators are given by 


prox, f, (v) = $, w), prox, f (v) = Ie(v). (4.60) 
Then the Douglas-Rachford splitting algorithm for the £! optimization problem 
(4.52) is given as follows: 
Douglas-Rachford splitting algorithm for £! optimization problem (4.52) 


Initialization: give an initial vector z[0] and a parameter y > O 
Iteration: fork = 0,1,2,...do 


z[k + 1] = $, (z[k]) 


(4.61) 
z[k + 1] = z[k] + Uc(2a2[k + 1] — z[k]) — æ[k + 1] 
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In this algorithm, the projection operator Ie on the linear subspace C defined in 
(4.53) is given by 


Ie(v) =v + 0! (001)! (y — bv). (4.62) 
Note that POT is invertible since has full row rank. 


Exercise 4.24. Show that the projection operator IIc for C defined in (4.53) is 
given by (4.62). 


The £! optimization problem (4.52) can be rewritten as a linear programming 
problem, which can be efficiently solved by the well-known interior-point method 
[47, Section 5.12.]. However, this method should solve a system of linear equations 
at each step of the iteration, which takes in general non-negligible computational 
time. On the other hand, the Douglas-Rachford algorithm in (4.61) only requires 


e simple continuous mapping of the soft-thresholding function Sy, 
e linear computation of matrix-vector multiplication and vector addition. 


Thus, the Douglas-Rachford algorithm is efficient and easy to implement compared 
to standard interior-point algorithms. 

To consider the convergence of Douglas-Rachford splitting algorithm, we define 
the relative interior ri(C) of a subset C C R” by 


ri(C) = {x e R” :x EC and Je > 0, N(x) Naff(C) c C}, (4.63) 
where Me (a) is the €-neighborhood of æ, that is, 
N: £ {w e R": |lu—2ll2 < €}, (4.64) 


and aff (C) is the affine hull of C, that is, the set of all affine sets containing C. Note 
that the relative interior is different from the interior of C that is defined by 


int(C) = {a e R” : x eC and Je > 0, N(x) CC}. (4.65) 
For example, let us consider the disk 
C = {(x1, x2, 0) E R? : x? + x2 < 1}, (4.66) 


on the x1-x2 plane in RÌ. Then, the interior of C is empty by definition, while the 
relative interior is 


ri(C) = {(41, x2,0) € R? : x? + x2 < 1}. (4.67) 


Now, we introduce the convergence theorem [25] for Douglas-Rachford split- 
ting algorithm. 
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Theorem 4.25. Suppose that fı and fù are proper, closed, and convex functions that 


satisfy 
ri(dom(fi)) N ri(dom(f2)) # 9. (4.68) 

Ao suppose that 
fi(#) + hæ) > 00 as ||ax\l2 > æ. (4.69) 


Then each sequence {x[k]}} o generated by Douglas-Rachford splitting algorithm con- 
verges to a solution of the optimization problem (4.58). 


4.3.2 Dykstra-like Splitting Algorithm 


Here we consider the following optimization problem: 
—— l 2 
minimize fı(æ) + fo(x) + =æ — vll, (4.70) 
xeR” 2 


where fı and fz are proper, closed, and convex functions. We apply Douglas- 
Rachford splitting algorithm to this optimization by splitting the cost function into 
fiand f2 + HI - —v ||. Then an algorithm called Dykstra-like splitting algorithm 
is obtained as follows. 


Dykstra-like splitting algorithm for (4.70) 


Initialization: set x[0] = v and p[0] = q[0] = 0; give a parameter y > 0 
Iteration: fork = 0, 1,2, ... do 


z[k + 1] = prox, s, (Œ[k] + pIk]) 
pik + 1] = z[k] + plk] — z[k] 
x[k + 1] = prox, s (z[k] + q[k]) 


g{k + 1] = z[k] + q[k] — æ[k + 1] 


An important application of Dykstra-like algorithm is to find the projection 


of a point onto the intersection of two convex sets Cy and C2, namely to find 
I¢e,nc,(v). This is done by setting fı = Ic, and f2 = Ic,, indicator functions 
defined in (4.37), for the optimization problem (4.70). Since PLOXy Jo, = IIc, and 
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POX, Jo, = IIc,, the algorithm is given by 


z[k + 1] = IIc, (æ[k] + pIk]) 

pik + 1] = æ[k] + plk] — z[k] 
z[k + 1] = Uc, (z[k] + q[k]) 

qik + 1] = z[k] + q[k] — x[k + 1] 


(4.72) 


This is called Dykstra’ projection algorithm, proposed by Dykstra [14].' The name 
“Dykstra-like splitting” is actually after this algorithm. The convergence theorem 
is given as follows [25]. 


Theorem 4.26. Suppose that fı and fo are proper, closed, and convex functions that 


satisfy 
dom( f1) O dom(f2) Æ Ø. (4.73) 


Then each sequence {x[k]} o generated by Dykstra-like splitting algorithm converges 
to a solution of the optimization problem (4.70). 


Compared to the assumptions in Theorem 4.25 for Douglas-Rachford splitting 
algorithm, the assumption (4.73) is weaker. 


4.4 Proximal Gradient Method for £! Regularization 


We here consider an efficient algorithm for £ l regularization (or LASSO): 
1 
minimize =||®a — yl|5 + Allallr. (4.74) 
xeR” 2 
We assume that ® e R”*", y e R”, and å > O are already given. 


4.41 Algorithm 


In (4.74), the first term A Ox — yl and the second term A||a||1 are both proper, 
closed, and convex functions of x. Also, the proximal operator of the first term, a 
quadratic function of æ, is obtained in a closed form as described in Section 4.2.3 
(see also Exercise 4.27 below). Hence, we can directly apply the Douglas-Rachford 
splitting algorithm (4.59) to this problem. 


1. Note that Dykstra for this algorithm is different from Dijkstra who found a famous algorithm for a shortest 
path in a network. 
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Exercise 4.27. Show that the proximal operator of f(x) = ibr — yll is 
given by 


ee ee 1 
prox, (v) = (o + 21) (oy + Lo) . (4.75) 


As we have seen before, the proximal operator is an “alternative” to the gradient 
descent update as shown in (4.35). However, the first term LI Ox — yl of (4.74) 
is a quadratic function of æ, which is differentiable. So, we can directly benefit from 
the gradient of 5 ||®a-—y i itself for an algorithm instead of its proximal operator. 
We here introduce an efficient algorithm using the gradient, which can be much 
faster than the Douglas-Rachford splitting algorithm, along with an acceleration 
method. 

First, let us consider the following general problem: 


minimize fila) + f(x), (4.76) 


where fi is a differentiable and convex function satisfying dom( f1) = R”, and fz is 
a proper, closed, and convex function. Note that fz is not necessarily differentiable, 
for example an indicator function as in (4.37). 

For the optimization problem, we introduce the proximal gradient algorithm, 
which is given as follows: 


Proximal gradient algorithm for (4.76) 


Initialization: give an initial vector x[0] and a real number y > 0 
Iteration: fork = 0,1,2,...do 


a[k + 1] = prox, - (xlk] — y V fı (Œ[k])). (4.77) 


In this algorithm, y > 0 is the step size of the update. The function V f(a) is the 
gradient of fı atx e R”. 
We offer a geometrical interpretation of the proximal gradient algorithm. Let us 


define 


p(x) = prox, p (a — y V fi(@)). (4.78) 


Then, from the definition of the proximal operator in (4.19), we have 


1 
(x) = argmin| A +5, le- -rY Ae) | 
zeR” ; i (4.79) 
= agmin| file x) + f2(z) + z |7 = ei], 


zeR” 
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fil2) 


Figure 4.10. Linear approximation Ĵi (z; x) of convex function fi (z) at x. 


where 
file; x) ê fila) + VAi(a) "(2 — x). (4.80) 


Note that ||V fi (a) ||5 and f}(a) are constant for the minimization with z, 
and hence ||V fi (a) ||5 is eliminated and fı (æ) is added in (4.79). The func- 
tion fı(z; æ) is a linear approximation of f(z) around the point œ € R”. 
Figure 4.10 shows an example of the linear approximation for one dimensional 
case. From (4.79), the function ø (æ) is the proximal operator of the linearized 
function fi (z; £) plus fo(z), and the iteration (4.77) can be interpreted as the 
proximal algorithm (4.24) for this approximated function. 


4.4.2 Convergence Analysis 


Here we analyze the convergence of the proximal gradient algorithm. For this, we 
define Lipschitz continuity. A function f : R” > R” is said to be Lipschitz con- 
tinuous over R” if there exists a constant L > 0 such that the following inequality 


Il f(@) — F(yll2 < Lila — yll2 (4.81) 


holds for any vectors æ, y € R”. When f is Lipschitz continuous, L is called a 
Lipschitz constant, and the smallest L that satisfies (4.81) is called the best Lipschitz 
constant. 

Let us consider the optimization problem (4.76). We assume that the gradient 
V fi of fi is Lipschitz continuous, that is, there exists L > 0 such that 


IV fi(z)-VAi@ylle < Lile—yll2, Væ, y eR” (4.82) 


holds. Assume that the optimization problem (4.76) has an optimal solution æ*. 
Then we have [92, Section 4.2] 


a” = p (x*) = prox, p (æ* — y V fi(w")). (4.83) 
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This implies that an optimal solution æ* of (4.76) is also a fixed point of map- 
ping ¢ in (4.78). From this, the meaning of the iteration (4.77) is now clear; this 
algorithm seeks the fixed point of ¢. 


Exercise 4.28. Consider a continuous function ¢ : R” +> R”. Assume that there 
exists an initial vector x[0] € IR” such that the iteration 


z[k +1] = ¢(z[k]), k=0,1,2,... (4.84) 


converges to %* € R”. Prove that æ* is a fixed point of mapping ¢, that is, %* = 


ġ(x*) holds. 
In fact, the following theorem holds [5]. 


Theorem 4.29. Assume the gradient V fı is Lipschitz continuous over R", and L is 
a Lipschitz constant satisfying (4.82). Assume also that the step size y satisfies 


1 
Z, 4. 
ys T (4.85) 


Then the sequence {x[k]} generated by the proximal gradient algorithm (4.77) con- 
verges to a solution a* of (4.76), and we have 


le[k +1] — ax" |lo < æ[k]— æ*l2, &=0,1,2,... (4.86) 
Moreover, we have 


L\\|x[0] — x*|\5 


k=0,1,2,..., 4.8 
2k ’ a 49> ( 7) 


f (a[k]) — f(@") < 


where f(x) = fi(x) + f(s). 


By this theorem, the convergence rate of the proximal gradient algorithm is 
O(1/k). Note that this rate is much slower than /inear convergence (or first-order 
convergence), with which the rate is O(r*) with |r| < 1. 

Now, let us derive the proximal gradient algorithm of our £! regularization 
(4.74). In our case, the two functions are 


1 
fie) = 5dr- yl fŒ) = Alle, (4.88) 
and the gradient of fı (a) is given by 


V fi(z) = D! (Ox — y). (4.89) 


Proximal Gradient Method for £! Regularization 77 
Also, the proximal operator of fa (æ) = Alla||; is the soft-thresholding operator 
(see Section 4.2.5): 

Prox, f (v) = $10). (4.90) 


From these, the proximal gradient algorithm for (4.74) is given as follows. 
Proximal gradient algorithm (ISTA) for (4.74) 


Initialization: give an initial vector æ[0] and parameter y > 0 
Iteration: fork = 0, 1,2, ... do 


ak + 1] = S, 1 (x[k] — y ®' (@a[k] — y)). (4.91) 


This algorithm is called the iterative shrinkage thresholding algorithm, or ISTA for 
short. From (4.89), a Lipschitz constant of V fi is 


L = Amax(®' D) = Omax(®)* = || Ol’, (4.92) 


where Amax and Omax respectively denote the maximum eigenvalue and the maxi- 
mum singular value, and ||®|| is a matrix norm defined by Omax(®). Note that if 
® + 0, then ||®|| > 0. From (4.85) in Theorem 4.29, if we choose y to satisfy 


1 
O<y< IoP (4.93) 


then a solution of the £! regularization (4.74) is obtained after the simple iteration 
of (4.91). 

Theorem 4.29 implies that the error by ISTA decreases at the rate of O(1/k). 
We can then accelerate the algorithm by using not only a[k] but also the previous 
a[k — 1] in the k-th step. The following algorithm is the accelerated iteration called 
FISTA (Fast ISTA), which converges at the rate of O(1/k?) [5, 118]. 


Fast ISTA (FISTA) for (4.74) 


Initialization: give initial vectors x[0], z[0], initial number t[0], and param- 
etery > 0 
Iteration: fork = 0,1,2,...do 

alk + 1] = S,,(z[k] — y ®' (@z[k] — y)), 


1+ J1 + 4t[k]2 
2 > 


el ei ane) 
kal] x x[k]). 


tik +1] = 


(4.94) 


zik + 1] = æ[k + 1] + 
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It is surprising that such simple modification leads to improvement of compu- 
tational efficiency from O(1/k) to (1/k*). However, it is known that O(1/k7) is 
optimal and one cannot accelerate the algorithm any further [92, Section 4.3]. 


4.5 Generalized LASSO and ADMM 


In this section, we consider an extension of £! regularization, with a generalized 
regularization term: 


1 
minimize ~||®a — yll} + 4l ¥æll1, (4.95) 
xeR” 2 


where ¥ is a matrix. We call this optimization problem the generalized LASSO. If Y 
is the identity matrix, this problem is reduced to the £! regularization in (4.74). A 
problem is that the regularization term ||‘¥ ||; is in general not proximable, that 
is, it is difficult to obtain a closed form of the proximal operator of ||‘? ||. There- 
fore, we do not directly apply Douglas-Rachford splitting nor the proximal gradi- 
ent method to this problem. In this section, we introduce an alternative splitting 
method for this case. 


4.51 Algorithm 


Aside from the generalized LASSO in (4.95), let us consider a general optimization 
problem: 


minimize fı(æ)+ f2(z) subject to z = Pa, (4.96) 
xeR”,zeRP 
where fı : R” > RU{oo}and fo : RP —> RU{00} are proper, closed, and convex 
functions, and Y € R? *”. The following algorithm, called Alternating Direction 
Method of Multipliers, or ADMM for short, is an efficient algorithm to solve (4.96): 
ADMM for (4.96) 


Initialization: give initial vectors z[0], v[0] € R”, and real number y > 0 
Iteration: fork = 0,1,2,... do 


1 
alk + 1] := argmin [næ B | Wx — z[k] + vik] | , (4.97) 


xeR” 


z[k + 1] := prox, p (Walk + 1] + v[k]), (4.98) 
v[k + 1] := v[k] + Balk + 1] — z[k + 1]. (4.99) 
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To analyze this algorithm, we introduce the augmented Lagrangian: 
L p(w, z, X) = file) + fle) +A" Yæ- 2) + Fx — 23, (4.100) 


where A is a Lagrange multiplier and p is a positive constant. The term “aug- 
mented” means that the function (4.100) is augmented from the usual Lagrangian 


L(x, z, A) = fi) + fo(z) +A! (Wa — 2). (4.101) 


by adding the term Fla — Bila: Note that the augmented Lagrangian becomes 
strongly convex with respect to variables æ and z thanks to the additional term 
E\|Wa — z||5 if YTY is positive definite. 

Now, let y © p`! and v[k] = yAfk]. Then the ADMM algorithm 
(4.97)-(4.99) can be rewritten in terms of augmented Lagrangian as 


x[k + 1] = argmin Lp (a, z[k], A[k]) (4.102) 
xeR” 

z[k + 1] = arg min Lp (æ[k + 1], z, A[k]) (4.103) 
zeRP 

A[k + 1] = A[k] + p (¥ z[k + 1] — z[k + 1], k=0,1,2,... (4.104) 


Exercise 4.30. Show that the algorithm in (4.102)—(4.104) is equivalent to 
(4.97)-(4.99) under the transformation y = ors v[k] = y A[k]. 


The important point of this algorithm is that the optimization for variables æ, 
z, and A is decoupled. The first step (4.102) is minimization of the augmented 
Lagrangian for x with fixed z and A. The second step (4.103) is for z with fixed 
x and A. The third step (4.103) is update for A using obtained æ and z. 

The following is a convergence theorem for the ADMM algorithm [12, 35]: 


Theorem 4.31 (Convergence of ADMM). Consider the optimization problem in 
(4.96). Assume that fı and fù are proper, closed, and convex functions. Assume also 
that the Lagrangian (4.101) has a saddle point, that is, there exist x*, z*, and X* such 
that 


L(a*, z*, A) < L(a*, z*, A*) < L(x, z, A*), Va, z, AÀ. (4.105) 


Then, the ADMM algorithm (4.97)—(4.99) satisfies the following convergence proper- 


ties: 
e The residual 


rik] = Balk] — z[k], k=0,1,2,... (4.106) 
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converges to O ask —> œ. This implies that the iterates converges to a feasible 
solution of (4.96). 
© The objective value f\(a[k]) + f2(z[k]) converges to the optimal value 


fs 1 fi(@) + fol). (4.107) 


Yu=z 


e if PITY js positive definite, then the sequence {(x[k], z[k])} converges to an 
optimal solution (a* , z*) of the optimization problem (4.96). 


We can now derive the ADMM algorithm for the generalized LASSO (4.95). 


First, since fj (a) = 5||Oa — yll, the minimization in (4.97) becomes 


1 1 
argmin | Se ~ yl + z Yæ — z[k] + otk113} 
xeR” 2 2y 
F lutw\ [aT l yT 
— (o ®+—Y¥ v) (o y+—Y¥ (z[k]— v). 
y y 
(4.108) 


Exercise 4.32. Prove the equality in (4.108). 


Next, since fo({x) = Allæ]|ı, the proximal operator in (4.98) is the soft- 
thresholding function. In summary, the ADMM algorithm for the generalized 
LASSO (4.95) is given as follows. 


ADMM for generalized LASSO (4.95) 


Initialization: give initial vectors z[0], v[0] € R”, and real number y > 0 
Iteration: for k = 0, 1,2,... do 


-1 
alk +1] = (oo + LyTy) (oy 4 L yT isik = ott) 
7 y 
(4.109) 


z[k + 1] = S,4(Wal[k + 1] + v[k]) (4.110) 


v[k + 1] = v[k] + Ya[k + 1] — z[k + 1]. (4.111) 


If we compute the inverse matrix (O'® + y~!¥'W)—! offline (i.e. before 
the iteration), the above ADMM algorithm just includes matrix-vector multipli- 


cation, vector addition, and element-wise soft-thresholding. By this property, one 
can implement this algorithm in a small device and execute very fast. Moreover, if 
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the matrix PTO + y~! WT © is a tridiagonal matrix, the linear equation 
T lt 
P P+-Y Yjr=v (4.112) 
7 


with unknown g can be solved in O (n) [41, Section 4.3], and the first step (4.109) 
can be computed very efficiently. 


4.5.2 Total Variation Denoising 


Here we consider total variation denoising for images, which can achieve noise 
reduction and edge preserving at the same time. Let us assume we have a noisy 
image Y € R”*”, where each element in Y is the pixel value of the image of size 
n x m. From 2D image data Y , we pull out each column vector, say y € R”, and 
solve the following optimization problem, one by one: 


n—-1 
ae 2 
xL- A Xj41 — Xi |. 4.113 
minimize | yll5 + > [Xi41 il ( ) 


i=l 


The first term is the € error between æ and y for proximity to the data, while the 
second term is the £! norm of the difference, called the zotal variation, for flatness 
of the result. The optimization problem (4.113) is a special case of the generalized 
LASSO (4.95) with ® = Z (identity matrix) and 


-~1 1 0 0 
yal OY Oe el (4.114) 
i Fa a 0 
0 0 -1 1 


We can directly exploit ADMM (4.109)-(4.111) for this problem. Moreover, the 
matrix O! &+y —! ! Y is a tridiagonal matrix, and the algorithm can be executed 
very fast as mentioned above. 

The total variation, which is described by ||‘¥a||1, can be explained as a convex 
approximation of the £ 0 total variation ||P xllo. Minimizing the £ 0 total variation 
leads to a sparse difference vector, and hence the optimization result can be maxi- 
mally flat (i.e., the difference = 0 in all but a few pixels). A few nonzero differences 
come from image edges. That is, we assume that there are just a few edges in an 
image. 


82 Algorithms for Convex Optimization 


Original image Noisy image 


Figure 4.11. Original image (left) and noisy image (right). 


Restored image Restored image 


i 


Figure 4.12. Total variation denoising with 2 = 50 (left), 2 = 100 (right). 


Now, we show the results of total variation denoising. Figure 4.11 shows the 
original image and noisy image. The noise in the noisy image is so-called salt-and- 
pepper noise with noise density 0.05. Roughly speaking, about 5% of the original 
pixels are randomly replaced by black or white pixels. 

From the noisy image, we remove noise by the total variation denoising. We use 
the ADMM algorithm with y = 1. The maximum number of iterations in ADMM 
is set to N = 500. For the 2-D image, we first run total variation denoising hor- 
izontally and then vertically. That is, we run the algorithm twice for one image. 
Figure 4.12 shows the results of denoising with 4 = 50 and A = 100. If you take 
larger A, the variation between adjacent pixels will be smaller. Comparing images 
with 4 = 50 and 4 = 100 in Figure 4.12, the image with 4 = 100 gives an 


impression of more smoothness than that with 1 = 50. This effect is much more 
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Restored image 


Figure 4.13. Result of total variation denoising with 2 = 200. 


perceptible when 2 = 200. Figure 4.13 shows the result. The total variation term 
is too strong in this case, and the restored image is now unacceptable. 

In summary, to obtain a good result, you should carefully choose the parameter 
A, which affects the quality of denoising. However, there is no general rule for 
optimal A, and you should choose 4 by trial and error. 

MATLAB programs to do this example are available at the end of this chapter. 
You can experiment the total variation denoising by yourself. 


4.6 Further Reading 


To study convex optimization deeply, you can choose a renowned book by 
Boyd and Vandenberghe [13]. The PDF version of the book, lecture slides, and 
MATALB/Python programs for exercises can be available in 


http://web.stanford.edu/* boyd/cvxbook/ 


You can also choose a recent book by Bertsekas [7]. This book devotes much spaces 
to recent algorithms such as the proximal gradient algorithms and ADMM. If you 
need deep and mathematical knowledge of convex operation at a research level, 
you can consult the book [4] by Bauschke and Combettes. For proximal splitting 
algorithm, you can refer to [25, 92]. The book chapter [5] by Beck and Teboulle is 
a nice introduction of ISTA and FISTA. For ADMM, you can read the book [12] 
by Boyd et al. 
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MATLAB Programs 


To run the main program, you need Image Processing Toolbox in MATLAB. 


MATLAB program for Section 4.5.2 (Total variation denoising) 


%% Read image 

Img = imread(‘cat.jpg’); % read image 
X_orig = rgb2gray(Img); % Color to gray 
[n,m] = size(X_orig); % Image size 


%% Noise (salt & pepper) 
Y = imnoise(X_orig,’salt & pepper’,O.05); 


%% Display images 
igure; 
imshow(X_orig); 

itleC Original image’); 
igure; 

imshow(CyY); 

itleC Noisy image’); 


%% Denoising 

% optimization parameter 

ambda = 50; 

% matrix Phi and Psi 

Phi = eye(n); 

Psi = -diag(ones(n,1))+diag(ones(n-1,1),1); 


% ADMM iteration 

gamma = 1; % step size parameter 

N = 500; % number of iterations 

X_res = zeros(n,m); % restored image 

Z = zeros(n,m); V = zeros(n,m); % initial values 

M = sparse(Phi’*Phi + (//gamma)*Psi’*Psi); % sparse matrix 


% vertical processing 
Yv = double(y); 
W = Phi’*Yv; 
for k=1:N 
X = M\(W+gamma\Psi’*(Z-V)); 
P = Psi*X+V; 
Z = soft_thresholding(gamma*lambda,P); 
V=P-Z; 
end 


% horizontal processing 
W = rot90(X); 
for k =1:N 
X = M\(W+gamma\Psi’*(Z-V)); 
P = Psi*X+V; 
Z = soft_thresholding(gamma*lambda,P); 
V=P-Z; 
end 
X = rot90(X,3); 


%% Result 

figure; 
imshow(uint8(round(X))); 
title? Restored image’); 
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MATLAB function for Soft-thresholding operator Sj (v) 


function sv = soft_thresholding(lambda,v) 


[m,n] = size(v); 
mn = m*n; 
sv = zeros(m,n); 
for i = mn 
if abs(v(i))<=lambda 
sv(i) = O; 
else 
svi) = v(i) - sign(vCi))*lambda; 
end 
end 


DOE: 10.1561/9781680837254.ch5 


Chapter 5 


Greedy Algorithms 


In the previous chapter, we have formulated the problem of sparse representation 
as optimization problems with £! norm, for which there are efficient and fast algo- 
rithms. The idea was to approximate the non-convex and discontinuous €° norm 
by the convex £! norm. In this chapter, we consider alternative algorithms that 
directly solve the £°-norm optimization by using the greedy method. 


Key ideas of Chapter 5 


e Greedy algorithms are available to directly solve £? optimization. 

e The greedy algorithms introduced in this chapter show the linear con- 
vergence, which are much faster than the proximal splitting algorithms. 

e A local optimal solution is obtained by a greedy algorithm, which is not 

necessarily a global optimizer. 


5.1 £ Optimization 


Let us consider the following £° optimization problem: 
minimize ||a||o subject to Ox = y, (5.1) 
xeR” 
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where we assume ® € R”*” and y € R” are given. To consider this optimization 
problem, let us first define the mutual coherence of a matrix. 


Definition 5.1. For a matrix ® = [@1, @2,...,@n] E R””” with di € R”, 
i=1,2,...,n, we define the mutual coherence u(®) by 


Koi, $j) 


u(®) ê Uu 
bi ae on Wdillallejlla 
iA 


(5.2) 


The mutual coherence is the maximum value of the absolute value of the inner 


product of $j /||;||2 and 6; /||@j; lz. That is, 


mex | Qi $j ) 
leill’ llo;ll2 


The value Tene is the cosine of the angle 6;; between two lines along with œ; 


u(®) = (5.3) 


genes 


and @;. If the angle is small (i.e. coherent), then this value is large (close to 1), and 
if the angle is large (close to 90°, incoherent), then the value is almost 0. Figure 5.1 
illustrates these properties. Hence, the mutual coherence is described as 


u(®) = max |cosG;|. (5.4) 
i,j=l,..n 
if] 
Roughly speaking, if the vectors #1, ..., Ọn are uniformly spread in R”, then the 
mutual coherence u(®) is small. On the other hand, if some vectors in the dic- 
tionary are coherent like a tassel, then u (®) is large. Figure 5.2 shows dictionaries 


with large u (®) and small one. 
From Cauchy-Schwartz inequality 


I(x, y)| < llellallyll2, Yæ, y e R”, (5.5) 


Figure 5.1. Angle 6; between two lines along with ¢; and ¢;: coherent vectors (left) 
and incoherent vectors (right). 
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$ı po 


P3 


Figure 5.2. Dictionary {ġ1, p2, p3} with large u(®) (left) and small u(®) (right). 


the maximum value of the mutual coherence is 1. Since the equality in (5.5) holds 
if and only if the two vectors a and y are parallel, we have u(®) = 1 if and only 
if there exist parallel vectors in {@1, @2, ..., dn}. On the other hand, the mutual 
coherence is always non-negative, and u (®) = 0 if ® is a square orthogonal matrix. 

By using the mutual coherence, we can characterize the solution of the €° opti- 
mization (5.1) [37, Theorem 2.5]: 


Theorem 5.2. Jf there exists a vector x € R” that satisfies linear equation Dx = y, 
and 


1 1 
llællo < z (1 + <a) ; (5.6) 


then æ is the sparsest solution of the linear equation. 


By this theorem, let us consider properties of the solution(s) of the £? optimiza- 
tion problem in (5.1). 
First, let us assume u(®) < 1. That is, there are no parallel vectors in 


{h1, b2,..., Pn}. Then, we have 


1 1 


and hence if there exists a 1-sparse solution æ (i.e. ||ælļo = 1) of equation Ox = y, 
then this is the sparsest solution from Theorem 5.2. Now, we have 


y= Ox = x11 + x22 +: + XnGn, (5.8) 


and hence the 1-sparse solution is parallel to one of @1, @2,..., On. From this, 
we find @; that is parallel to y. This is formulated as a problem to find an index 
i e {1,2,...,m} that minimizes the error e(i) defined by 


eli) Ê min |x@j — yl3. (5.9) 
xeR 
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If there exists x with ||a||9 = 1, then there exists an index i € {1,2,...,m} that 
achieves e(i) = 0. Now, the minimum value of (5.9) can be easily obtained as 
follows: 
e(i) = min [lagi — yll 
xeR 


= min { (Øi, bi)x? — (di, y)x + (y, y)} 


xeR 
2 
(Qi, y) lhi yy (5.10) 
= min 4 leili x — > | +u- 5 
xeR Pills Pi lls 
= 5 . 
lgili 
From this formula, we can find one index i* that satisfies e(i*) = 0 (if it exists) by 
computing e(i) fori = 1,2,...,n. Then we have 
(dix, Y) 
y=x"*oe, x* 5 ; (5.11) 
l leil 
and the corresponding 1-sparse vector a* is given by 
i* 
Vv 
a* = [0,...,0,x*,0,..., 0]. (5.12) 


This computation requires O(n) computational time at the worst case. 
Let us generalize this observation. Assume that there exists a natural number k 
that satisfies 


u(®) < — (5.13) 
Then we have 
(1+ : ) > 5042-1 =% (5.14) 
2 u(®) 2 


Assume also that there exists a k-sparse solution (i.e. ||ælļo < k) of the linear equa- 
tion Ox = y. From Theorem 5.2, this is the sparsest solution. Then, the vector 
y is a linear combination of k vectors in the dictionary {@1, @2, . . . , dn}. As we 
have seen in Section 2.4 in Chapter 2 (p. 23), to find the k-sparse solution by the 
exhaustive search, we need (7) or O(n*) computations, which cannot acceptable 
in large scale problems. 

For such problems, a method called the greedy method is available. This method 
is an iterative method for a global solution, in which the locally optimal choice is 
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made at each stage. Although this method does not always give a global solution, 
this method is a powerful tool for combinatorial problems. In the next section, we 
introduce greedy algorithms for the £? optimization problem in (5.1). 


5.2 Orthogonal Matching Pursuit 


5.21 Matching Pursuit (MP) 


First, we introduce the simplest greedy algorithm called matching pursuit (MP for 
short) to solve the €° optimization problem in (5.1). This algorithm iteratively 
seeks a 1-sparse vector that is a solution of a local £? optimization problem. As 
mentioned above, a 1-sparse optimal vector is easily obtained with O(n) computa- 
tions. Matching pursuit aims at finding a global solution by iterating such an easy 
local optimization problem. 

The algorithm of matching pursuit iteratively approximates the solution of linear 
equation Da = y by decreasing the residual r[k] = y — Balk] at each step. The 
procedure is shown as follows: 


1. Find a 1-sparse vector [1] that minimizes |æ — y||2. 
2. Fork = 1,2,3,...do 


e Compute the residual r[k] = y — ®a[k] 
e Find a 1-sparse vector x* that minimizes ||®a — r[k]||2 and set 


xik +1] = x[k] + 2%. 


At the first step, we seek a 1-sparse vector æ[1] that minimizes ||®a — y|l2. Let 
x[1] be the non-zero element of [1] and i[k] the corresponding index, that is, 
i[l] 
y T 
x[1] = [0, ..., 0, x[1],0,...,0] Halley, (5.15) 
where e;, i € {1,..., n} is the standard basis in R” defined by 
vV 
e; =[0,...,0,1,0,...,0]' € R”. (5.16) 
Then, from (5.10), i[1] and x[1] are easily obtained as 


i[l] = argmin e(i) 


ie{l,...,n} 
i 2 
— arg min Iyl (Pi, y) 
= 2 
feted) I pill 
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= arg max Si i > 
ie{l,....n i 
Pa (5.17) 
gis SAURIA 
Pita lls 
The residual is given by 
r[1] = y — O2[1) = y — x[ 1 dif, (5.18) 
and we have 
y = x[1]i] + rl). (5.19) 


We can easily check (see Exercise 5.3 below) that the residual vector r[1] is orthog- 
onal to @ițı], and hence we have 


lyh = lxil + rH. (5.20) 
If the residual ||r[1]]]2 is sufficiently small, then 
GU) = x[]oin = Ox[1] (5.21) 
is a good approximation of y. Figure 5.3 illustrates this observation. 


Exercise 5.3. Prove that the residual vector r[1] is orthogonal to ¢j[1]. Also prove 
that the equation (5.20) holds. 


At the second step, we seek a 1-sparse vector that is the best approximation of 
the residual vector r[1] in (5.18). The 1-sparse vector is easily obtained by (5.10) 
with r[1] instead of y. Let a[2] be the 1-sparse vector, x[2] its non-zero element, 
and i[2] the corresponding index. Then we have 


(i, r[1]) _ (Qiz TLL) 


= er, x2] = (5.22) 
ie(i,...n) leili loial? 


hı . 


Po zillo 


Pz 


Figure 5.3. Vector y[1] = x[1]@2 with i[1] = 2 that is the best 1-sparse approximation of y. 
The residual vector r[1] is orthogonal to q2. 
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The residual vector r[2] is given by 
r[2] = r[1] — ®a[2] = r[1] — x[2] 412), (5.23) 
and from (5.19), we have 
y = x1] diy + x[2] i + r[2). (5.24) 
It is easily shown that @j [2] and r[2] are orthogonal to each other, and 
Iro = lxil + Ur (21113. (5.25) 
From this with (5.20), we have 
lvls = xph + pals + Ur 2113. (5.26) 
Now we obtain a 2-sparse vector 


ill] i[2] 
v v 
a[2] £[0,...,0,xf1],0,...,0,x[2],0,...,0]) = xfer + x[2]eip, 


(5.27) 
which gives a 2-sparse approximation of y as 
912) = xpi + x[2) 6:12) = Ox[2]. (5.28) 
Figure 5.4 illustrates this. 
If we continue the same procedure, we have at the k-th step 
y = x[l]pi + x[2] bi (2) + +--+ xk] bit + rik]. (5.29) 
Define the k-sparse vector by 
alk] = x[1eii + x[2]ei(2] +: -+ + x[k]ei]. (5.30) 


pı y r[2] 


Figure 5.4. Vector x[2]@3 with i[2] = 3 that is the best 1-sparse approximation of the 
residual vector r[1]. The residual vector r[2] is orthogonal to @3, and y = x[1]@2 + x[2]@3+ 
r[2] holds. 
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Then the vector y is approximated by using this k-sparse vector as 
glk] = xoin + x[21 big] + --- + xk bing = Salk]. (5.31) 


If the residual r[k] approaches to O as k — 00, then we obtain an approximated 
solution of the £? optimization problem in (5.1) by stopping the procedure at small 
k. The exhaustive search requires O(n*) computations to obtain k-sparse vector, 
while matching pursuit needs just O(nk) computations. 

We summarize the algorithm of matching pursuit as follows: 


MP for £? optimization (5.1) 


Initialization: Set x[0] = 0, r[0] = y,k = 1 
Iteration: while ||r[k]||2 > eps, do 


(bi, rik — 1])? 
[k] := SS 
ee lek 
a= (ik, rik - U 
Dita lls (5.32) 


x[k] := x[k — 1] +x[keiqey, 
r[k] := rik — 1] — x[k] ditx, 
k:=k+1 


In this algorithm, eps is the termination tolerance that should be fixed beforehand. 


Exercise 5.4. Prove that the following equality holds at the k-th step in the MP 
algorithm: 


k 


ys = Do ioil + Iriki. (5.33) 
j=l 


Moreover, show that if 1, h2,..., Ọn are normalized, that is 
lgill2=1, Vi €e{1,2,... n}, (5.34) 
then the following equality holds: 


k 


Iyl = $, xi? + Iriki. (5.35) 
j=l 


The following theorem gives the convergence property of the MP 
algorithm [71]. 
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Theorem 5.5. Assume that dictionary {Q1, Q2, ..., Pn} has m linearly independent 
vectors (i.e. rank(®) = m). Then there exists a constant c € (0, 1) such that 


irkilis luli &=0,1,2,.... (5.36) 
From this theorem, it follows that the residual r[k] monotonically decreases and 
lim r[k] = 0 (5.37) 

k= œ 


holds. 

The convergence rate in (5.36) is first order or linear, and the residual decreases 
exponentially, that is, O(c*). This rate is much faster than FISTA in (4.94) (p. 77) 
for the €! regularization, which has O(1/k*) convergence. 


5.2.2 Orthogonal Matching Pursuit (OMP) 


We have seen that the residual r[k] by the matching pursuit (MP) algorithm (5.32) 
decreases very fast. However, in general, it does not always achieve r[k] = O ina 
finite number of iterations, and the output vector æx[k] for large k, or limy_5 00 xk] 
may not be sparse. This is because MP may choose an index i[k] that was already 
chosen in previous steps. Orthogonal Matching Pursuit (OMP) is an algorithm to 
improve MP to achieve a finite number of iterations to obtain a sparse solution. 
This is done by removing an index from candidates if it was once chosen. Let us 
see the procedure of OMP precisely. 
At the k-th step in MP, we choose the index by 


' a 2 
i[k] = arg max Ae He Mr he) 


arg max PAE » rO]J=y, k=1,2,... (5.38) 


To memorize indices that were chosen in the previous steps, we define the set Sp of 
the chosen indices by k-th step as 
Sp = SU uk, So=O, k=1,2,... (5.39) 
Also, let us define a linear subspace Cg of R” spanned by vectors i, i € Sx, that is, 
C, £ span{ġ; : i € S} = S xii :x; eR}. (5.40) 
1ESK 


OMP approximates the vector y at each step by a vector in Cg, while MP approx- 
imates it by just one vector j,k]. More precisely, OMP chooses a vector y[k] in 
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Pı 


$ 
2 Ck = span{d; : i € Sk} 


Figure 5.5. The k-th step of OMP: find the best approximation gy[k] of y in the linear 
subspace Cr = span{ġ; : i € Sz}. The residual vector r[k] = y — y[k] is orthogonal to Cz. 


Cp that has the minimum €? distance from y. This is obtained by the orthogonal 
projection of y onto Cx: 


lk] = arg min ||v — yll5 = He, (y), (5.41) 


vECK 


where IIc, is the projection operator onto Cx. Figure 5.5 illustrates this projection 
at the k-th step. 
Using the restriction notation,’ we can characterize the condition v € Cx as 


v= > nA = 05,2, (5.42) 


TES, 


for some & € RÝ. Note that #(S,) = k holds as explained later. Then, the pro- 
jection in (5.41) is obtained by finding the coefficients of y[k] with respect to the 
basis functions @;, i € Sz in Cy. That is, we find 


z nd w 
a[k] = arg min z lPsŽ — yll. (5.43) 


zeRk 
This is the least squares solution? 
&[k] = (D£ Os) ‘oly. (5.44) 


Note that the matrix DL Ws, is always invertible (this will be explained later). Then 
gik] is given by 


is i —1 
JIk] = Ps F[k] = Ps, (V4, Ds) Of y. (5.45) 
Define the coefficient vector æ[k] € R” with respect to @j,i € {1,2,..., n} by 


(wIk]) 5, = Ik], (æ[k]) se = 0, (5.46) 


1. For the restriction notation, see (2.45), (2.46), and (2.47) in Chapter 2 (p. 24). 


2. For the least squares solution, see Section 3.1.2 in Chapter 3 and equation (3.23). 
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where Sf is the complement of Sp. Then we have 
JIk] = Oak]. (5.47) 
The residual vector r[k] = y — y[k] is given by 
rik] = y — glk] = {I — Ds, (D3 Os) OL Jy. (5.48) 


It is easily shown that the residual vector r[k] is orthogonal to the linear subspace 
Cx (see Figure 5.5), that is, 


(v,r[k]) =0, VuoeC,. (5.49) 


This means that any vector Q; in Ck will never be chosen by the maximization at 


the (k + 1)-th step: 


i, r[k]} i, P[K])? 
i[k+1]= arg max ae arg max aie (5.50) 
ie{1,2,...,.n) leili ie{1,2,...,.n} leili 
Pi FC 


since (@;, r[k]) = 0 holds for any ġ; € Cx, from (5.49). Also, we see that ¢j, 
i € Sp are always linearly independent since Qipk+1] ¢ Ck holds for any k, and 
hence P Ọs, is invertible. The name orthogonal matching pursuit comes from 
this property of orthogonality. 

We summarize the algorithm of OMP as follows. 


OMP for £? optimization (5.1) 


Initialization: Set x[0] = 0, r[0] = y, So = ð, k = 1 
Iteration: while r[k] 4 0 do 


i r[k — 1])? 
i[k] := arg max al a 
iefl,...,n} Pills 


Sk = Sk—1 U {i[k]}, 


> 


Bk] := (DF Ds) PSY, 
(x[k])s, := z[k], 
(a[k])s¢ := 0, 


rik] := y — Os, a[k], 


(5.51) 


k:=k+1 
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The following theorem shows that if there exists a sufficiently sparse solution of 
the equation ®a = y, then OMP gives the solution of the £? optimization (5.1) 
in a finite number of iterations [37, Theorem 4.3]: 


Theorem 5.6. Assume that ® € R”*" is surjective, that is, rank(®) = m. Assume 
also that there exists a vector x € IR” such that Dx = y and 


1 1 
lz Ilo < 5( + z5) (5.52) 


where u(®) is the mutual coherence of matrix ®©. Then, this vector x is the unique 
solution of the g? optimization (5.1), and OMP gives it ink = ||æ||o steps. 


We should note that at each step of OMP we need to compute the matrix inver- 


sion of (O Ps)! DL y. If the number k = ||a||9 in Theorem 5.6 is very large, 
then this inversion may impose a heavy computational burden. 


5.3 Thresholding Algorithm 


In this section, we consider the following optimization problems: 


1 

minimize =||®a — yll + Allællo (5.53) 
xeR” 2 
1 

minimize =||®a — yl|5 subject to |lallo < s (5.54) 
xeR” 2 


The first problem (5.53) is called the €° regularization, and the second problem 
(5.54) is called the s-sparse approximation. Note that these optimization prob- 
lems are non-convex and combinatorial. For these problems, we introduce efficient 
greedy algorithms by borrowing the idea of the proximal gradient algorithm studied 
in Chapter 4. 


5.3.1 Iterative Hard-thresholding Algorithm (IHT) 


Let us consider the following optimization problem: 
minimize fı (x) + fo(x), (5.55) 
xeR” 


where fi is a differentiable and convex function satisfying dom( f1) = R”, and fo 
is a proper, closed, and convex function. The proximal gradient algorithm for this 
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Figure 5.6. Hard-thresholding operator Ho (v). 
is given by” 
z[k + 1] = prox, ¢ (æ[k] — 7 V fı (x[k])). (5.56) 


For the £? regularization of (5.53), we have 


1 
fila) £ Li yl, fæ) Ê Allallo. (5.57) 


Although the function f2 is not convex in this case, we thoughtlessly apply it to 
the proximal gradient algorithm (5.56). Now, the proximal operator of f2(Œ) = 
Allællo has a closed form, namely the hard-thresholding operator (see Figure 5.6) 
defined by 


ipaa a lee (5.58) 
0, lui) <8, i=1,2,...,n, 


with 0 = ./2y 7, that is, 


prox, f (v) = Hy). (5.59) 


See Exercise 4.21, p. 68 for details. As shown in Figure 5.6, the hard-thresholding 
operator rounds small elements (|v;| < @) to 0. Figure 5.7 illustrates this operation. 
By using this operator, the proximal gradient algorithm for the €° regularization 
(5.53) is given as follows. 


IHT for €° regularization (5.53) 


Initialization: Give an initial vector æ[0] and positive number y > 0. 
Iteration: fork = 0,1,2,...do 


alk + 1] = H yy7(2lk] — y ®' (ak) — y)). (5.60) 


This algorithm is called the iterative hard-thresholding algorithm (IHT). 


3. See Section 4.4 (p. 73). 
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Figure 5.7. Hard-thresholding operator Hg(v) rounds small elements (|n;| < 0) to O, where 


O6=/f2yiA. 


For the convergence of the iterative hard-thresholding algorithm (5.60), the fol- 
lowing theorem is proved in [9]: 


Theorem 5.7. Assume that 
1 


y 


holds where \|®\| is the maximum singular value of ®. Then the sequence 
{x[0], x[1], x[2], ...} generated by the iterative hard-thresholding algorithm (5.60) 
converges to a local minimizer of the €° regularization (5.53). Moreover, the conver- 
gence is first order, that is, there exists a constant c € (0, 1) such that 


|a[k + 1] — a*|lo <clla[k] —2"*ll2, &=0,1,2,..., (5.62) 
where x* is a local minimizer. 


The condition in (5.61) is very similar to the condition in (4.93) (p. 77) for the 
proximal gradient algorithm for £! regularization. The convergence rate O(c*) of 
IHT is much faster than that of ISTA, O(1/k) or FISTA, O(1/k7). Note how- 
ever that IHT just gives a local minimizer, which is not necessarily equivalent to a 


global one. 


5.3.2 Iterative s-sparse Algorithm 


Here we consider the s-sparse approximation (5.54). By using the indicator func- 
tion (4.37) in Chapter 4 (p. 65), we rewrite the constrained problem of s-sparse 
approximation (5.54) as an unconstrained optimization problem. Let us denote by 
Xs the set of s-sparse vectors in R”: 


Z, = {æ ER": ||xllo < s}. (5.63) 
Exercise 5.8. Show that È, is a non-convex set. 


The indicator function Jy, for the set Xs is given by 


Is, (a) = f lelas (5.64) 


co, llælo > s. 
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Figure 5.8. s-sparse operator H;(v) with s = 3: the 3 largest elements in magnitude are 


unchanged and the other elements are set to O. The numbers 1, 2,3 indicates the rank of 
the absolute values of the elements. 


By using this, the s-sparse approximation (5.54) is equivalently described by 
ee. 2 
minimize =||®x — yll + Iz, (x). (5.65) 
xeR” 2 i 


Note that since Xs is non-convex (see Exercise 5.8), the indicator function Ty, is 
not a convex function. Anyhow, let us apply this to the proximal gradient algorithm 
(5.56). To do this, we should compute the proximal operator of the indicator func- 
tion I, , which is equal to the projection onto the set Zs. The projection is actually 
obtained by 


Is, (v) = arg min |x — v|l2 = Hs(v), (5.66) 


werd, 


where Hs (v) is the s-sparse operator that sets all but the s largest (in magnitude) 
elements of v to 0. Figure 5.8 illustrates this operation. Note that the projection is 
in general not unique. If s largest elements are not uniquely determined, then they 
can be chosen either randomly or based on a fixed ordering rule. 


Exercise 5.9. Prove that equation (5.66) holds. 


Let ys(v) denote the s-th largest elements of vector v e R”. Then the 
S-sparse operator Hs(v) can be represented by using the hard-thresholding opera- 
tor (5.58) as 


Hs (v) = H; w) w). (5.67) 


From this, the s-sparse operator is sometimes called as the hard-thresholding oper- 
ator. 

By using the s-sparse operator (5.66) as the proximal operator of the indicator 
function Ig,, we obtain the proximal gradient algorithm (5.56) for the s-sparse 
approximation (5.65). 
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Iterative s-sparse algorithm for s-sparse approximation (5.54) 


Initialization: Give an initial vector æ[0] and a positive number y > 0 
Iteration: fork = 0, 1,2, ... do 


alk + 1] = Hs (x[k] — y D' (@a[k] — y)). (5.68) 


We call this algorithm the iterative s-sparse algorithm. Note that this is sometimes 
called the iterative hard-thresholding algorithm (IHT). 
For the iterative s-sparse algorithm, we have the following convergence theo- 


rem [9]. 


Theorem 5.10. Assume that the matrix ® € R”*" is surjective, that is, rank(®) = 
m, and the column vectors Qi, i = 1,2,...,n, are non-zero, that is, 


loill2 > 0, Vi e{1,2,..., n}. (5.69) 


Assume also that the constant y > O satisfies 


1 


y 


Then the sequence {x[0], x[1], £[2], ...} generated by the s-sparse algorithm (5.68) 
converges to a local minimizer of the s-sparse approximation problem (5.54). Moreover, 
the convergence is first order, that is, there exists a constant c € (0, 1) such that 


lælk +1] — @*l]2 < clæ[k]— æ*ll2, k=0,1,2,..., (5.71) 


where x* is a local minimizer. 


5.3.5 Compressive Sampling Matching Pursuit (CoSaMP) 


For the s-sparse approximation (5.54), we can extend the algorithm of OMP in 
Section 5.2.2 with the s-sparse operator Hs. This algorithm is called the compressive 
sampling matching pursuit (CoSaMP). 

In the OMP algorithm (5.51), we first choose one index i[k] as 


(Qi, r[k — 1])* 


i[k] = arg max (5.72) 
jell) llors 
Alternatively, CoSaMP chooses 2s largest values of 
i rik — 1])° i ; 
(di, rl : ee as $ He n) l 65.73) 
Pills Il Pi ll 
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and includes these 2s indices in the index set Sz, that is, 


Qi 2 
Sk = Sk- usap [a (9 r[k — n) )} (5.74) 


As in OMP, we then find the projection of y onto the linear subspace Cy = {@j : 
i € Sk}. That is, 


&[k] = (D$ Os) OL y. (5.75) 


From this, we define an n-dimensional coefficient vector z[k] as 


a | (EK); ie Se. 
ki = : .76 
(zIk]) i on (5.76) 


Note that the number of nonzero coefficients in z[k] is larger than 2s. We then 
prune z[k] to an s-sparse vector æ[k] as 


z[k] = Hs (z[k]). (5.77) 
Also, we update the index set Sk to 
Sk = supp(ax[k]). (5.78) 


Finally, we obtain the CoSaMP algorithm to solve the s-sparse approxima- 
tion (5.54). 
CoSaMP algorithm for s-sparse approximation (5.54) 


Initialization: Set x[0] = 0, r[0] = y, So = ð 
Iteration: fork = 1,2,...do 


Z a 
nn = spp | (( (2 d uy). 


Sk := Sk-1 UT[K], 
&[k] := (OL s) O} y, 
(z[k])s, := Z[k], (5.79) 
(z[k])s¢ := 0, 
x[k] := Hs (z[k]), 
Sk = supp{x[k]}, 
rik] := y — gs, z[k]. 


For the convergence of the CoSaMP algorithm, see the original paper [88]. 
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5.4 Numerical Example 


Here we solve sparse optimization numerically by using the greedy algorithms stud- 

ied in this chapter. Let us consider the problem of curve fitting studied in Section 
80 

—t 


3.3 (p. 48). with the sparse polynomial y = + t. The data are given as in 


Section 3.3, the data points are given by 
ti =0.1(i— 1), i=1,2,...,11, (5.80) 


from which we reconstruct the 80-th order polynomial. Here we consider the fol- 
lowing 6 algorithms: 


e! optimization considered in Section 3.3 (p. 48) 
. matching pursuit (MP) 

. orthogonal matching pursuit (OMP) 

. iterative hard-thresholding (IHT) 

. iterative s-sparse algorithm (ISS) 


WN BR O&O NHN 


. compressive sampling matching pursuit (CoSaMP) 


The matrix P e R!!*8! is given by (3.16), which satisfies 


1 
0.012 < —; < 0.013, (5.81) 
|| ||? 


and we choose the parameter y for IHT and ISS as 


1 
y = 0.01 < 0.012 < TA (5.82) 
From Theorems 5.7 and 5.10, the condition (5.82) guarantees convergence to local 
minimizers for IHT and ISS. We also choose 4 in the £ regularization problem as 
A = 0.001. 

Figure 5.9 shows the coefficients obtained by the algorithms. The coefficients 
are ordered from the highest degree to the lowest degree. We see that the £! opti- 
mization, MP, OMP, and CoSaMP give exact coefficients, while IHT and ISS show 
incorrect reconstruction. To see this more precisely, we check the estimation error 
r = y— 2", where x* is the obtained vector when the algorithm stops. Table 5.1 
shows the error with the number of iterations required to achieve the error. 

All but £! optimization stop the iteration when the error ||r[k]||2 is less than 
1075 or the number of iterations is larger than 10°. 

IHT and ISS attained the maximum number of iterations 10°, and their errors 
are much larger than those of the other methods. This is because they were trapped 
into local minimizers. The other methods show fast convergence, among which 
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Figure 5.9. Estimation of sparse coefficients. 


Table 5.1. Estimation error |y — ®x*||2 and number of iterations. IHT and ISS reached the 


maximum number 10° of iterations. 


Methods £! OPT MP OMP IHT ISS CoSaMP 
Error 2.7 x 10719 911075 41x 1071 0.0017 0.83 4.1 x 1071! 
Iterations 10 18 2 10° 10° 3 


OMP (2 iterations) and CoSaMP (3 iterations) especially present surprising results. 
In view of the error and the number of iterations, OMP is the best method in this 
case. It should be noted that greedy algorithms do not necessarily give a global 
solution. OMP is the best in this case but in other cases, another method may be 
the best. This depends on the problem and data, and we should adopt trial and 
error to seek the best algorithm. 
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5.5 Further Reading 


Basics of greedy algorithms can be found in [26, 62]. For the characterization of £? 
optimality by using the mutual coherence, or the restricted isometry property (RIP), 
which is not studied in this book, you can refer to [37, 38]. 

The matching pursuit (MP) was first proposed in [71], the orthogonal matching 
pursuit in [28, 94]. For the iterative hard-thresholding algorithm and the iterative 
s-sparse algorithm, see the paper [9]. The compressive sampling matching pursuit 
(CoSaMP) was proposed in [88]. 


MATLAB function of MP 
MP.m 


function [x,nitr]=MPcy,Phi,EPS,MAX_ITER) 
[m,n] = size(Phi); 
x = zeros(n,]); 
rey; 
k=0; 
Phi_norm = diag(Phi’*Phi); 


while (norm(r)>EPS) & (k < MAX_ITER) 
p = Phi’*r; 
v = p./sqrt(Phi_norm); 
[z,ik] = max(abs(v)); 
v2 = p./Phi_norm; 
z = v2(ik); 
x(ik) = x(ik)+z; 
r = r-z*PhiC,ik); 
k = k+1; 

end 

nitr=k; 

end 
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MATLAB function of OMP 
OMP.m 


function [x,nitr]=OMP(y,Phi,EPS,MAX_ITER) 
[m,n] = size(Phi); 
x = zeros(n,]); 
ray, 
k=0; 
S = zeros(n,1); 
Phi_norm = diag(Phi’*Phi); 
while (norm(r)>EPS) & (k < MAX_ITER) 
p = Phi’*r; 
v = p./sqrt(Phi_norm); 
[z,ik] = max(abs(v)); 
Sik) = ik; 
Phi_S = Phi(:,S(S>0O)); 
x(S(S>0)) = pinv(Phi_S)*y; 
r = y-Phi*x; 
k = k+1; 
end 
nitr=k; 
end 


MATLAB function of hard-thresholding operator H; (v) 
hard_thresholding.m 


function hv = hard_thresholding(lambda,v) 
[m,n]=size(v); 
mn = m*n; 
hv = zeros(m,n); 
for i = 1:mn 
if abs(v(i))<=lambda 
hv(i) = O; 
else 
hv(i) = vi); 
end 
end 
end 


Further Reading 107 


MATLAB function of the support of a vector 


supp.m 


function | = supp(x) 
| = find(abs(x)>0)’; 
end 


MATLAB function of iterative hard-thresholding algorithm 
IHT.m 


function [x,nitr]=IHT(y,Phi,lambda,gamma,EPS,MAX_ITER) 
[m,n] = size(Phi); 
x = zeros(n,1); 


r=y; 
k= 0; 
while (norm(r)>EPS) & (k < MAX_ITER) 
p = x + gamma * Phi’*r; 
x = hard_thresholding(sqrt(2*lambda*gamma),p); 
S = supp(x); 
r = y-PhiC:,S)*x(S); 
k = k+1; 
end 
nitr=k; 
end 
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MATLAB function of s-sparse operator 


sS_sparse_operator.m 


function y = s_sparse_operator(x,s) 
[n,m]=size(x); 
y=zeros(n,m); 
[xs,indx]=sort(abs(x),1, descend’); 
indx_s = indx(l:s); 
yCindx_s)=x(indx_s); 

end 


MATLAB function of iterative s-sparse algorithm 


iterative_s_sparse.m 


function [x,nitr]=iterative_s_sparse(y,Phi,s,gamma,EPS,MAX_ITER) 
[m,n] = size(Phi); 
x = zeros(n,]); 
ray, 
k=0; 
while (norm(r)>EPS) & (k < MAX_ITER) 
p = x + gamma * Phi’*r; 
x = S_sparse_operator(p,s); 
S = supp(x); 
r = y-Phi(:,S)*x(S); 
k = k+1; 
end 
nitr=k; 
end 


Greedy Algorithms 


Further Reading 109 


MATLAB function of CoSaMP 
CoSaMP.m 


function [x,nitr]=CoSaMP(y,Phi,s,EPS,MAX_ITER) 
[m,n] = size(Phi); 
x = zeros(n,]); 
rey; 
k=0; 
s= p: 
Lambda = []; 
Phi_norm = diag(Phi’*Phi); 
while (norm(r)>EPS) & (k < MAX_ITER) 
p = s_Sparse_operator((Phi’*r)./sqrt(Phi_norm),2*s); 
Ik = supp(p); 
S = union(Lambda,!k); 
Phi_S = Phic:,S); 
z = zeros(n,1); 
Z(S) = pinv(Phi_S)*y; 
x = §_sparse_operator(z,s); 
Lambda = supp(x); 
r = y-Phi_S*z(S); 
k = k+1; 
end 
nitr=k; 
end 


MATLAB code for the simulation in Section 5.4 


clear; 
%% data 
% polynomial coefficients 
x_orig = [-1,zeros(1,78),1,0]’; 
% sampling 
t= 0:0.1:1; 
y = polyval(x_orig,t)’; 
% data size 
N = length); 
M = N-1; 
% Order of polynomial 
M_I = length(x_orig)-1; 
% Vandermonde matrix 
Phi=[]; 
for m=0:M_l 
Phi = [t’.°m,Phi]; 
end 
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%% Sparse modeling 
% iteration parameters 
EPS=1e-5; % if the residual < EPS then the iteration will stop 
MAX_ITER=100000; % maximum number of iterations 
% L1 by CVX 
cvx_begin 

variable x_l1(M_I+1) 

minimize norm(x_11,1) 

subject to 

Phi*x_lI1 == y 

cvx_end 
% Matching Pursuit 
[x_mp,nitr_mp]=MP(y,Phi,EPS,MAX_ITER); 
% OMP 
[x_omp,nitr_omp]=OMP(y,Phi,EPS,MAX_ITER); 
% CoSaMP 
s = length(supp(x_orig)); 
[x_cosamp,nitr_cosamp]=CoSaMP(y,Phi,s,EPS,MAX_ITER); 
% IHT 
lambda=0.001; 


gamma=0O.01; 
[x_iht,nitr_iht]=IHTCy,Phi,lambda,gamma,EPS,MAX_ITER); 

% iterative s-sparse 

gamma=0O.01; 
[x_iss,nitr_iss]=iterative_s_sparse(y,Phi,s,gamma,EPS,MAX_ITER); 


DOI: 10.1561/9781680837254.ch6 


Applications of Sparse Representation 


In this section, we introduce applications of sparse representation in a finite- 
dimensional space for systems and control. 


6.1 Sparse Representations for Splines 


In this section, we consider curve fitting by using splines. As discussed in Chapter 3, 
we consider the following two-dimensional data: 


D = {(t1, y1), (t2, Y2), - - +s (fm, Ym)}; (6.1) 


where 0 < ty < t2 < --- < tm = T are sampling times, and y1, y2,..., Ym are 
obtained from the following observation: 


yi = ylti) +é, i=1,2,...,m, (6.2) 


where y is a function and €; is additive noise. The problem of curve fitting is to 
estimate the unknown function y form data D. 

In Chapter 3, we have assumed the function is a polynomial function, and shown 
that with a fixed order of the polynomial, the problem becomes a convex optimiza- 
tion. Here we seek a function among more general functions called splines. Namely, 
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we consider the following optimization problem: 


m T 
minimize > vt) — yil? + af Odt, (6.3) 
y 0 


i=l 


where we assume the second derivative j is in L? (0, T). The first term is for the 
fidelity of curve fitting to the data, and the second term is for the smoothness of the 
curve. In general, if you increase the fidelity then the curve becomes less smooth, 
and hence we need to control the trade-off between them to appropriately choose 
the parameter A > 0. 

Note that since y is not a finite-dimensional vector but a function, the problem is 
an infinite-dimensional problem. However, to use techniques in Hilbert space theory, 
the problem can be reduced to a finite-dimensional optimization problem. Let us 
first show this in this section. For this, we introduce the formulation of control 
theoretic splines [36, 108]. 


6.1.1 Solution by Projection Theorem 


First, let us define 


xi(t)S y(t), (290), ult) SO. (6.4) 


Then, the optimization problem can be described as 


m T 
minimize Dba w+. f u(t)|°dt 


ueL2(0,T) icl 


(6.5) 
subject to &(t) = Aa(t)+bu(t), y(t)=c'a(t), te[0,T] 


r0) = 0 


where x(t) £ [x1(t), x2(t)|' and 


afo d afo afl 
aaf, af), ca]. «66 


Note that this formulation is for more general optimization than (6.3) by choosing 
another set of (A, b, c). 
Define 


TeAC—DB if O<t< 
na fe 5 Se (6.7) 


, otherwise 
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and 
di(t) ELl, ti), i=1,2,...,m. (6.8) 
Then we have 


T 
y(ti) = (Qi, u) 22 =f Gi(t)u(t)dt, i=1,2,...,m. (6.9) 
0 


From this, the problem (6.5) becomes 


m E 
minimize È |(ġ;, u) 1.2 — wt af |u(t)|7dt. (6.10) 
=] 0 


ueL?(0,T) m 


Then, if we define z; = (¢;, u) z2, the optimization problem is described as 


m T 
minimize > lzi — yil? + af u(t) dt 
: 0 

i=] 


ueL2(0,T) (6.11) 
subject to z; = (Qi, u)ņ2, i= 1,2,...,m. 
Define a new Hilbert space H by 
H = L?(0, T) x R”, (6.12) 


with inner product 


T 
(4 ; Be 2w z +f u(t)o(t)dt. (6.13) 


Then, consider a closed linear subspace M of H defined by 
M ê H EH: z= (gi, (6.14) 

and a vector p € H defined by 
p= H €H, (6.15) 


where y = [y1, Y2, .-., Ym]! € R”. Then, for r Ê (u, z) € H, we have 


m T 
Ir pI, = X lzi — yil? + af lu(e)l?dt, (6.16) 
i=] 


where || - || #7 is the norm induced by the inner product (-, -) g, that is 


Ir- plia = V(r —p,r— p)n. (6.17) 
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M+ 


M++p 


Figure 6.1. Projection theorem: the projection of p onto M is given by r* e (Mt + p)nM. 


The optimization problem (6.11) is now rewritten as 
minimize ||r — Pilar subject to re M. (6.18) 
reH 


The minimizer is given by the projection of p € H onto the closed linear subspace 
M C H. Let M+ denote the orthogonal complement of M in H. That is, 


He e 


Then, from the projection theorem, the minimizer r* is in the set (M t4pyn 
M (see [36, Section 2.3]). Figure 6.1 illustrates this property form the projection 
theorem. 

Now, let us characterize the set M4 + p. Take (v, w) € M+. Then, from (6.14), 
for any (u, z) € M, we have 


= (Le) EE) 


T 
=wilz+ af v(t)u(t)dt 
0 
(fi, u) 72 
(p2, U) 12 
= wl E + 1{v, u) 72 (6.20) 
(m> u) 72 


= > wi (Pi, U)p2 + A(v, u) 72 


m 
i= 


1 
2 wihi + Av, ‘| 
i=l 


L2 
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This equation holds for any u € L?(0, T), and hence we have 


Š wigi + Av =0, 


i=l 


or 
1 m 
i= 


From this, the subspace M+ can be represented by 


1 
Mt = [7 ~ w eR"| i 


w 


and also we have 


1 
M++p= ieee ad w eR". 
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(6.21) 


(6.22) 


(6.23) 


(6.24) 


Now, let us obtain the minimizer r* = (u*, z*) € (M+ + p) N M of (6.18) 


(see also Figure 6.1). First, since r* € M, we have 
zš = (ġi yas i=1,2,...,m. 


Next, since r* € M+ + p, we have 


1 m 
us = z > wi i 
i y—t 


Zz; = wit yi 


Inserting (6.26) into (6.25) gives 


i i 
a= (4-7 i] = ja wjlpi, Pit 
L2 


From (6.27), we have 


1 


T32" (Qi Pj) L2 = Wi + Yi, 


or 


(I + G)w = —4y, 


(6.25) 


(6.26) 


(6.27) 


(6.28) 


(6.29) 


(6.30) 
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where G is the Gram matrix defined by 


(di: Pı) (Pi: P2) --. (Pi, Pm) 
(d2, Pı) (2: p2) ... (P2, Pm) 


GA . (6.31) 
Uist). AiG. ain, Winds 
Since J > 0, the matrix AI + G is non-singular, and hence 
w= —-A(AL+G) ly. (6.32) 
Finally, from (6.26), we obtain the solution 
u* = -} Spo +G)'Ay].¢i 
i=l 
= X [(1+G)'y],4i (6.33) 
i=1 
z 5 a; pi 
i=1 
where 
ay 
a* =| : |= 0I+6 ly. (6.34) 
at 


m 


The important point of the solution is that the optimal solution of the infinite- 
dimensional optimization problem in (6.5) is describe as a finite number of spline 
functions $1, ..., Pm and the problem is reduced to computing the unknown coef- 
ficients a}, ..., aṣ}. In other words, the original problem (6.5) is fundamentally a 
finite-dimensional optimization problem. Note that this property is generalized to 
the representer theorem in statistical machine learning [104]. 

Finally, the optimal solution y* of the original optimization problem (6.3) is 
given by 


t T 
y*(t) =, f u*(s)dsdt. (6.35) 
0 JO 
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6.1.2 Sparse Representation 


From (6.33), the number of coefficients is equal to m, the number of data. If the 
data is very big (i.e., m is very large), then we need many coefficients to repre- 
sent the fitting curve y(t). Then, to use the idea of sparse representation, we can 
reduce the number of coefficients. For this, we restrict the feasible solutions of the 
optimization problem (6.10) to be 


u(t) = Do zigi), (6.36) 


i=l 


where Z1,..., Zm are unknown coefficients to be obtained. With this, we have 


(Gi,4)22 = lv Yi = > alén (6.37) 
j=l po j=l 


and hence 
(Pi, U) 72 
ee = Gx (6.38) 
(ns te) 
and 
S dip =n Silene 6.39) 


i=l 
Also, we have 


m 


T T m 
if uoPar =a f (Sem) 2210310) dt 


(6.40) 


=AD > aoine 


i=l j= 


= 


Therefore, under the assumption of (6.36), the optimization problem (6.10) is 
rewritten as 


minimize IGz — yl? + 4z! Gz. (6.41) 
ZE m 
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Then, to promote the sparsity of z, we add the £? norm as a regularization term: 


minimize ||Gz — yll? + 4z'Gz + pllzllo, (6.42) 
zE m 


where p > 0 is the regularization parameter. As usual, we can adopt the £! norm 
as convex relaxation of the £° norm. The relaxed convex optimization problem is 
described as follows: 


minimize ||Gz — yll? + Az! Gz + plizll1. (6.43) 
zeR™” 


This can be easily solved by the proximal gradient algorithm studied in 
Section 4.4 (p. 73). 


6.2 Discrete-time Hands-off Control 


In this section, we introduce sparse control (or hands-off control) for discrete-time 
systems. 


6.21 Feasible Control 


Let us consider a linear time-invariant system described by 
a[k + 1] = Aa[k]+ bu[k], k=0,1,2,...,n— 1, (6.44) 


where æ[k] € Rf is the state and u[k] € R is the control at time step k € 
{0,1,2,...,n — 1}. The matrix A € R?*@ and the vector b € Rf are assumed 
to be exactly known. The number n is the orizon length of the system. We call the 
difference equation (6.44) the state equation. 

Assume that the initial state x[0] = € is given by state observation. Then the 
control objective is to find a control sequence {u[0], u[1], ..., u[n — 1]} such that 
the control drives the state z[k] from [0] = §€ to the origin, that is, 


x[n] = 0. (6.45) 


From the state equation (6.44), we have 


k-1 k-1 
a[k] = Atæ[0] + X` AX!‘ buli] = AXE + X AN! buli], (6.46) 
i=0 i=0 
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fork e {0,1,...,2—1}. Then, the terminal constraint (6.45) can be described as 


n—-1 
a[n] = A"€+ >) A"! buli] = A"E + Ou =0, (6.47) 
i=0 
where 
u[O] 
u[1] 
@= [Ah A"?b ... Ab b], uê . (6.48) 
u[n — 1] 


Then the set of feasible controls that achieve (6.45) is given by 
U(n, £) = {we R” : A"E+ Ou =O}. (6.49) 
For the feasibility, we have the following lemma. 
Lemma 6.1. Suppose n > d and the following matrix M is nonsingular:' 
M+=[b Ab ... At 1b] e R?” (6.50) 
Then the feasible setU(n, €) is non-empty for any € € R¢. 


Proof: Since n > d and the matrix M is non-singular, the matrix ® in (6.48) has 
full row rank. It follows that ® is surjective and there exists at least one vector u 
that satisfies Du = —A"€ for any € e R. 


6.2.2 Maximum Hands-off Control 


The optimal control is a problem to seek the optimal solution(s) that minimizes a 
cost function among control vectors in U (n, £). A general form of the cost function 
is given by 
n-1 
J(u) = >) L(æ[k], u[k]), (6.51) 
k=0 
where the function L is called the stage cost function. 
The near quadratic control, or LQ control for short, has the following stage cost 
function 


L(a,u) =a! Ox +r\ul’, (6.52) 


1. The matrix M is called the controllability matrix, and the pair (A, b) is called controllable if M is non-singular. 
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where Q € RI*4 is a positive semidefinite matrix, and r > 0. In this section, we 
are interested in sparse control, also known as maximum hands-off control, which has 
the following stage cost function: 


L(a, u) = |ul°, (6.53) 


with which the cost function is given by 


n—-1 


J(u) = >> |ufk]|° = lullo. (6.54) 


k=0 


The optimization problem is then described as 


minimize lullo subject tou € U (n, €). (6.55) 
uel” 


As usual, we approximate the €° optimization by 


minimize || w\|1 subject tou € U(n, €), (6.56) 
welR” 


which is the £! optimization problem discussed in Section 4.3, which is efficiently 
solved by Douglas-Rachford splitting algorithm (see Section 4.3.1). 
Also one can consider the following cost function 


n—-1 
J(u) = > (xik Qectke] + Aluteh (6.57) 


k=0 


with positive semidefinite Q € Rid and À > 0. Inserting (6.46) into (6.57), we 
have 


J(u) =u! Ru +2q'u + Aljulli +c, (6.58) 


for some R € R"*", q € R”, and c € R. For this optimization, we can apply the 
ADMM algorithm discussed in Section 4.5. 


6.2.3 Model Predictive Control 


As discussed above, the control sequence u € R” is obtained by numerical opti- 
mization with given initial state observation € € Rf. Let denote by C the mapping 
from the initial state € € Rf to the optimal control sequence u € R”, that is, 


u = C(€). (6.59) 
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Then u = C(€) is a finite-horizon control (i.e., the control is applied to a plant 
in a finite length of time), and this is open-loop control. Open-loop control is some- 
thing like riding a bicycle with your eyes closed, which is very fragile against distur- 
bances. To make the control system robust, you need to implement the control as 
feedback control, where the controller constantly observes the state and update the 
control based on the latest state observation. 

To implement feedback control from the finite-horizon control u = C(&), we 
adopt the model predictive control (also known as receding horizon control). 

The model predictive control is described as follows: 


1. Observe the state z[k] at time k. 
2. Compute the optimal control sequence 


uolk | 
uy[k] 


ulk] = = C(x[k]). (6.60) 


Un—1[k] 


3. Use the first element of u[k], that is, uo[k], as the control at time k. 


From this, the control u[k] to the discrete-time plant (6.44) is obtained by 
u[k] = uo[k]=[1 0 ... O]C(af[k]). (6.61) 


Figure 6.2 shows the block diagram of the feedback control system where P is the 
plant. 

The important thing we should do next is to study the stability of the feedback 
system. The closed-loop system in Figure 6.2 may exhibit instability, that is, æ[k] 
may diverge, if we do not care about the stability. The instability is possible even 
when P and C are both stable. Therefore, to prove the stability is very important 
to design a feedback control system. 

First, we define the value function V (€) of the optimal control problem (6.56) by 


vV) ê n lull. (6.62) 


Figure 6.2. Feedback control by model predictive control u[k] = C(a[k]). P is the plant 
given by (6.44). 
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We have the following lemma: 


Lemma 6.2. Assume that the controllability matrix M in (6.50) and the matrix A 
in (6.44) are non-singular. Assume also thatn > d. Then the value function V (&) is 
convex, continuous, and positive definite. 


Exercise 6.3. Prove Lemma 6.2. 
Now, we give a detailed definition of stability. 
Definition 6.4. Let us consider the following discrete-time system 
alk +1] = f(elk)), &=0,1,2,... (6.63) 


Suppose that there exists the unique sequence {x[0], x[1], . . .} satisfying (6.63) for any 
initial state £[0] € R. Suppose also that the origin is an equilibrium of the system, 
namely, f (0) = 0 holds. Then the origin is said to be stable if for each € > O there 
exists 0 > O such that 


llæ[0]ll2 < 6 = llæfk]ll2 < €, Vk > 0. (6.64) 


The concept is very simple; the state trajectory {a[k]}7° 9 starting out near the 
origin will keep on staying near the origin and never diverge. 
From (6.44) and (6.61), the closed-loop system is described as 


alk +1] = Aalk]+[1 0 ... 0] C(a[k]) = f (ak). (6.65) 


It is easily shown that the origin O is an equilibrium of this difference equation. 
To show the stability of this equilibrium, Lyapunov’ theorem is available. 


Theorem 6.5. Suppose that there exists a function V : RÌ —> R satisfying 


V(0) =0. 

V (£) is continuous. 

V (E) > 0 for any #0. 

V(a[k+1]) < V(alk)) fork = 0, 1,2, ..., for the state trajectory {x[k ho 
of the system (6.63). 


m oe 


Then the origin O is stable under the system equation (6.63). 


A function V in Theorem 6.5 is called a Lyapunov function. The idea to prove the 
stability of our system (6.65) is to show the value function (6.62) to be a Lyapunov 
function. In fact, it is a Lyapunov function and we have the following theorem. 


Theorem 6.6. Assume M and A are non-singular, and n > d. Then the origin is 
stable under the system equation (6.65). 
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Proof: We prove the value function V (€) in (6.62) is a Lyapunov function of 
(6.65). The properties 1 to 3 in Theorem 6.5 are directly from Lemma 6.2. We here 
prove 4. Let 


u*[k] È [uš[k] ui[k] ... už [k] = Cate), (6.66) 
and define 
alk] © [ut] Stk]... ut, O]'. (6.67) 


Note that ŭ[k] is a shifted control sequence by one time step of the optimal control 
sequence u*[k] at time k. It is then easily shown that w[k] is a feasible control for 
a[k + 1], that is, 


alk] € U(n, sik + 1). (6.68) 
In fact, since u*[k] € U(n, æ[k]) we have 
A" alk + 1] + O&[k] 
= A” (Ax[k] + buğ[k]) + A" but [k] + --- + Abu*_,[k] 
= A(A"a[k] + A"! budlk] + A" but [k] + - +» + bu*_;[k]) 


(6.69) 
= A(A"a[k] + ®u*[k]) 
=Ax0 
= 0. 
Then, from the optimality of the value function, we have 
V (alk + 1]) = min{|lull; : u € Un, ak + 1))} 
< l&k] 
= |u [k] + luzik] +--+ + lui k] + 10] 
n—l (6.70) 
= $ lui [kl] — [uptkl 
i=0 
= V(alk]) — |wolk]| 
< V(æ[k]), 


for k =0,1,2,... 
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6.3 Further Reading 


The control theoretic smoothing spline was first proposed in [108]. The book [36] 
is a nice reference for the smoothing spline. The convex optimization formulation 
of constrained smoothing spline was considered in [78], and the sparse representa- 
tion was proposed in [79]. 

The maximum hands-off control was first proposed in [83] for continuous-time 
and discrete-time systems. Detailed discussions of maximum hands-off control for 
continuous-time systems can be found in Part II of this book. The model predictive 
control formulation was proposed in [86]. 


Part || 


Sparsity Methods in Optimal 
Control 


DOI: 10.1561/9781680837254.ch7 


Chapter 7 


Dynamical Systems and Optimal Control 


We have studied the idea and algorithms of sparse optimization in Part I. In part II, 
we will extend the sparsity methods to dynamical systems. For this, we here review 
basics of dynamical systems and optimal control, before we consider sparse control 
in the subsequent chapters. 


Key ideas of Chapter 7 


e A dynamical system is modeled by a differential equation called the state- 
space equation. 

e We cannot control uncontrollable systems. 

e Optimal control is the best control among feasible controls for a con- 

trollable system. 


7.1 Dynamical System 


A dynamical system is a system that depends on time. That is, a dynamical system 
is a moving system. Dynamical systems are around us; industrial products such 
as vehicles, airplanes, motors, electric circuits, etc, as well as movement of plane- 
tary, change of weather, ant swarm, cell movement, fluctuations in stock prices and 
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spread of virus. Dynamical systems are important not only in engineering but also 
in physics, biology, economics, and social science. 


711 State Equation 


In this book, we focus on a dynamical system that is described by a linear differential 
equation: 


x(t) = Aa(t)+bu(t), t>0, 2(0)=€eRY%, (7.1) 


where A € R2*4, b e RI, x(t) e RI, and u(t) € R. We call x(t) the state, and 
u(t) the control. The state x(0) = € at time t = 0 is called the initial state, and the 
differential equation in (7.1) is called the state equation. The dynamical system in 
(7.1) is controlled by the control u(t), and called a controlled object or a plant. 


Exercise 7.1. Show that the solution of the differential equation (7.1) is given by 


t 
x(t) =e“ E +f eA bu(t)dt, t>0. (7.2) 
0 


Example 7.2 (Rocket). Let us consider the control of a rocket in the outer space where 
no friction nor gravity acts (see Figure 7.1). The rocket is accelerated by thrust from a 
rocket engine. Let the mass of the rocket be m [kg]. We assume that the rocket can move 
on 1-dimensional straight line. Let the position of the rocket at timet > 0 ber(t) with 
initial position r(O) = &1, and initial velocity v(0) = r(O) = č. We denote the 
thrust force by F (t). From the Newton’ second law of motion, we have' 


mř(t) = Ft), rO)=a, 10)=co. (7.3) 
nè Fej ? 
r(0) = & rocket 


Ż(0) = & r(t), P(t) 


Figure 7.1. Rocket example. 


1. Strictly speaking, the thrust of a rocket is obtained by emitting its mass (e.g. fuel) to the opposite direction, 
and hence the model is not correct. That is, the mass m should be time-varying m(t) that decreases in time. 
In this example, however, we assume that the mass of the rocket is sufficiently large and the variation can be 
ignored. 
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Let us transform this differential equation into the state equation in (7.1). For this, 
define the state x(t) by 


a Oa FO 
d Be = BA a 


Then we have 

: an r(t) = x2(t) = 01 xı (t) 0 

ah) = Ba ~ Es ~ f 0| A = Ba ro) 
Defining u(t) = F (t) and 


A 0 1 A 0 A či 
sap J epe] elt) s 


we obtain the state equation of the form (7.1). 


(7.5) 


The system (7.3) or (7.5) is sometimes called the double integrator, since the 
position r (t) is obtained by integrating F (t) twice. 

Let us investigate the meaning of the state-space equation (7.1). Assume that 
the initial state (0) = € at time t = 0 is obtained from observation by a 
sensor attached to the system. The signal u(t) is called a control and we design 
u(t) for t > O to realize a desired trajectory of the state a(t). In the rocket 
control considered in Example 7.2, we design the thrust force u(t) = F(t) to 
drive the rocket, for example, within time T > O from the earth (a(0) = &) 
to the moon æ(T) = O with minimum fuel consumption. This is a problem of 
control. 

If the control u(t) for t > O depends only on the initial state æ(0) = &, 
then the control is called feedforward control. Instead, if the control u(t) for 
t > O is determined a constant (or an intermittent) observation of the state 
a(t) with O < t < t, then the control is called feedback control. Feedforward 
control uses only one observation æ(0) at time £ = 0. This is, so to speak, 
driving a bicycle (or a car) with eyes closed, while feedback control uses infor- 
mation from the eyes which is always (or sometimes) open. From this obser- 
vation, we can easily understand that feedforward control is very fragile against 
uncertainties and disturbances. The feedback structure solves this fragility and 
leads to robustness. However, we mainly consider feedforward control since it 
gives clear mathematical structures of the optimal control. For feedback con- 
trol implementation, one can adopt the receding horizon control, also known as 
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Figure 7.2. State transfer problem: finding a control u(t), 0 < t < T that drives the state 
from a given initial state 2(0) = € to 0. 


the model predictive control [70] as discussed in Section 6.2, or self-triggered 
control [48]. 


71.2 Controllability and Controllable Set 


We can consider many types of objectives of controlling the plant (7.1). For exam- 
ple, we set several target points a1,..., Zs to control the plant so that the state 
x(t) pass approximately thorough these points at time t = Ti, ..., Ts, that is 
x(T;) © xi. This control is called trajectory generation, or trajectory planning. We 
can also consider a control problem to keep the state x(t), t > 0 in a prescribed 
set V in the state space, that is, x(t) € ¥ for all t > 0, assuming 7(0) € X. This 
problem arises for example in keeping a drone hovering in a region. 

In this book, we mainly focus on the problem of state transfer. This problem is 
finding a control u(t) that drives the state x(t) from a given initial state € to the 
origin O in a given time T > 0 (see also Figure 7.2). 

First, we discuss the existence of the control. For this, we introduce the notion 


of controllability. 


Definition 7.3 (Controllability). We call the system (7.1) is controllable ¿if for any 
initial state x(0) = € € R®, there exist atime T > 0 and acontrolu(t),0<t < T 
such that the state x(t) in (7.1) is driven to the origin at timet = T, thatisx(T) = 0. 


If the system is not controllable, then there exists an initial state that cannot be 
achieved to the origin with any u(t) in finite time. The controllability is a funda- 
mental requirement for control systems, and in this book we always assume that 
the system (7.1) is controllable. 

Given a linear system, to check its controllability is an easy task. In fact, we have 
the following theorem for the controllability: 
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Theorem 7.4. The dynamical system (7.1) is controllable if and only if any of the 
following equivalent conditions is satisfied: 


1. The following matrix called the controllability matrix 
M=[b Ab A%b ... A'd] (7.7) 


is non-singular. 
2. The following matrix called the controllability grammian 


T 
G(T) ê | e^'bbl e^ at (7.8) 
0 
is non-singular for any T > Q. 
3. Foranyi €C, 


rank[A—AI B|=d. (7.9) 
4. For any left eigenvector v| of A, 
v'b#0. (7.10) 


From this theorem, to check the controllability of the dynamical system (7.1) is 
just to compute the determinant of the matrix M. 

The controllability of the system (7.1) is completely determined by the matrix 
pair (A, b). From this, we often say the pair (A, b) is controllable, which means the 
system (7.1) is controllable. 


Example 7.5. Let us consider the rocket model (7.5) and (7.6) in Example 7.2. The 
controllability matrix is given by 


M = [b a= ‘lt (7.11) 


It is easily shown that this matrix is non-singular. Thus, the system is controllable from 
Theorem 7.4. 


Note that if the dynamical system (7.1) is controllable, then for any initial state 
£ €e Rf, any final state € € Rf, and any time T > 0, there exist a control u(t), 
O < t < T that drives the state x(t) from æ (0) = € to x(T) = C. 


Exercise 7.6. Prove the above fact. 


In general, the shorter the time T > 0 is, the larger the magnitude and the 
shorter the support of u(t) should be. The shape of u(t) may approach to some- 
thing like the Dirac’s delta when T approaches to zero. However, in real systems, 
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the actuator cannot generate arbitrarily large magnitude of control, and there is 
always a limit on the maximum magnitude (can you make a vehicle that moves at 
1000km/h?). Hence, we assume the following limitation on u(t): 


lu(t)| <1, Wt € [0,7]. (7.12) 


We call a control that satisfies this constraint an admissible control. In (7.12), we 
assume the maximum magnitude is normalized to one, but if the maximum mag- 
nitude is Umax > 0 and the limitation is represented by 


|u(t)| < Umax, Vt € [0, T], (7.13) 


then, we can redefine the vector 6 in the plant (7.1) as 


pat (7.14) 
Umax 
then the limitation is reduced to (7.13). 
Under the constraint (7.12), there may be an initial state € that cannot be steered 
to the origin by any admissible control u(t) that satisfies (7.12) within time T > 0 
even if the system is controllable. To discuss this, we introduce the notion of the 
T-controllable set: 


Definition 7.7 (T-Controllable Set). Fix T > 0. The set of initial states that can 
be steered to the origin by some admissible control u(t), 0 < t < T is called the 
T-controllable set. We denote this set by R(T). 


Exercise 7.8. Prove that R(T) can be represented by 
T 
R(T) = {- e^ bu(t)dt : |u(t)| < 1, Vt € [0, r| ' (7.15) 
0 


For the T-controllable set, we have the following theorem: 


Theorem 7.9. For any T > 0, the T -controllable set R(T) is a bounded, closed, and 
convex set. Also, if Ti < Tz then R(T) C R(T2). 


Exercise 7.10. Prove Theorem 7.9. 


Figure 7.3 shows an illustration of a T-controllable set R(T) in R?. If an initial 
state £ (0) = € is in the T-controllable set R(T), then there exists an admissible 
control u(t), 0 < t < T that steers the state to (T) = O in time T. If an initial 
state is outside the set R(T), then such control does not exist. We show an easy 
example to illustrate this property. 


Example 7.11. Let us consider control of a ball on an inclined plane shown in 
Figure 7.4. Let x(t) denote the position of the ball on the x axis parallel to the slope. 
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Figure 7.3. T-Controllable set R(T) in R?. 


Figure 7.4. A ball on an inclined plane: m is the mass, g is the acceleration of gravity, @ is 
the angle of the slope, and F(t) is the force (i.e., control) applied to the ball. 


The origin is set at the top of the slope. The control objective is to move the ball from the 
initial position x(O) = č to the origin within time T > Q, that is, x(T) = 0. 
The differential equation of x(t) is given from Newtons second law of motion: 


mx(t) = F(t) — mg sin@. (7.16) 
Now, we assume 
x(0) = —čğ, x(0)=0, (7.17) 
where € > 0. From (7.16), we have 
x(t) = =f F(t)dt — gt sin 0 + x(0), (7.18) 
and 


x(t) = ~/ F(t)dtds — see sin + x(0)t + x(0). (7.19) 
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Then, if there exists an admissible control {F (t) :0 < t < T} such that x(T) = 0, 
then we have 


1 T sS 1 
o=x()= >| f F@drds- Set? sind —¢, (7.20) 


where we used the initial conditions in (7.17). From the above equation, we have 


< En |F (t)|dtds 


T S 
<f J | F lloodtds (7.21) 
JO JO 


72 
= —||Flloo, 
z WF loo 


1 
pet? sind +č 


where || F llo is the L® norm of function F defined by 


|Fllo = sup |F()I. (7.22) 
te[0,T] 


Since the variables m, g, T, and č are all positive, and sin@ is also positive, 
we have 


a 
iiss Sone sind + a (7.23) 


On the other hand, since F is an admissible control, it should satisfy 


IF loo < 1. (7.24) 
From (7.23) and (7.24), we have 
2 
1 > mgsinĝ + es (7.25) 
It follows that mg sin@ < 1 and 
2 
p> |—me _ are, (7.26) 
1 — mg sin 


From this, if the final time T is small such that T < T*, then there is no admissi- 
ble control with time T that achieves x(T) = 0. Conversely let T = T* and take 
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F(t)=1,0 <t < T*. Then from (7.19), we can easily compute that 
x(T*)=0. (7.27) 


Hence, F(t) = 1 is a solution with T = T*. Also, ifT > T*, then if we choose 
F(t) as 


1, f0<t<T*, 
F(t) = , ; (7.28) 
mgsinð, if T*<t<T, 


then you can easily show that this is a solution with time T. 


From this example, the time T* is the threshold that determines the 
T-controllability with a fixed initial state. The time T* is called the minimum time, 


which is in general defined by 
T*(€) £ inf{T > 0: € € R(T)}. (7.29) 


To consider the minimum time, we define the controllable set by the union of all 
R(T) with T > 0, that is, 


R £ UrsoR(T). (7.30) 


Even if the system is controllable, the controllable set R. may not be R4. That 
is, R may be a strict subset of R. Then, if € ¢ R, then there exists no admissible 
control on any finite time interval [0, T] that achieves (T) = O. For this case, 
we write T* (€) = oo. Conversely, if € € R then the set {T > 0: € € R(T)} is 
non-empty, and the minimum time (7.29) is finite, that is, T*(€) < 00. 

Assume that E € R. Then T* (£) < oo. Let us consider Tj and T such that 


Ti < T*() < T. (7.31) 
Then we have 
RTO C R(IT*(€)) CRID) CR. (7.32) 


This inclusion relation is shown in Figure 7.5. From this figure, we have € € (72) 
and € ¢ R(T1). This means that if the final time is greater than T* (€), then there 
exists a feasible control, while if it is less than T* (€), then there is no control. The 
minimum time 7T*(€) is the threshold for the controllability, that is, € is on the 
boundary of the T* (€)-controllability set R(T *(€)). We will discuss the minimum 
time further in the next subsection. 

For a stable system, the minimum time always exists for any initial state. In fact, 
we have the following theorem [49, Theorem 17.6]: 
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Figure 7.5. Controllable sets R(T) C R(T*(€)) C R(Dh) C R, where Ti < T*(€) < Th. 
The minimum time T*(¢) is the threshold for the feasibility from the initial state, £. 


Theorem 7.12. Assume that (A, b) is controllable. Assume also that A is stable, 
that is, 


A(A) C C_ S {zeC: Rez < 0}, (7.33) 


where (A) is the set of eigenvalues of A. Then the controllable set R is RÊ, and the 
minimum time T* (£) is finite for any € € R¢, 
71.3 Feasible Control and Minimum-time Control 


Fix T > 0 and assume z (0) = € € R(T). Then by the definition of controllable 
set in Definition 7.7, there exists an admissible control u(t) that steers the state 
from «(0) to (T) = 0. We call such a control a feasible control. Let denote by 
U(T, €) the set of feasible controls with initial state € and final time T. This set 
can be represented by 


T 
U(T, €) =u e L” (0,T): ¿E= -| e™^bu(t)dt, |u(t)| < 1, Yt € [0, ri). 
0 
(7.34) 


Exercise 7.13. Prove that the set of feasible controls is represented by (7.34). 
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It is easily shown that € € R(T) if and only if there exists an admissible control u 
such thatu € U(T, £). Hence the minimum time T* (£) in (7.29) is formulated by 


T*(€) = inf{T > 0 : 3u, u € U(T, &)}. (7.35) 


From the discussion in the previous subsection, T* (€) is finite ifand only if € € R. 
From this, if € € R, then there exists a finial time T > 0 and an admissible control 
u such that u € U(T,&), and hence the set of the right hand side of (7.35) is 
non-empty. 

Now we find the control that achieves this minimum time. That is, we consider 
the following optimization problem: 


minimize T subject to u € U(T, &). (7.36) 
u 
The solution is called the minimum time control or the time-optimal control. The 


minimum control exists if € € R or equivalently T* (€) < oo. In fact, we have the 
following lemma: 


Theorem 7.14. Assume T*(€) < œ. Then there exists a minimum-time control 
u* EU(T*(€), €). Moreover, for any T > T* (€), UCT, €) is non-empty. 


Exercise 7.15. Prove Theorem 7.14. 


71.4 Optimal Control and Pontryagin Minimum Principle 


From Theorem 7.14, if E € Rand T > T*(€), then the set of feasible controls 
U(T, €) is non-empty, and in general U/(T, €) has infinitely many elements. Opti- 


mal control is the optimal one with a given cost function among all feasible controls 
in U(T, €). 
The following is the formulation of the optimal control that is mainly considered 


in this book: 


Optimal Control Problem (OPT) 


For the plant modeled by 
x(t) = Ax(t) + bult), t>0, 2x(0)=€eR’, 
find an admissible control u (i.e. ||ulloo < 1) that achieves 


x(T)= 0, 


and minimizes the following cost function: 


T 
J(u) = | L(u(t))dt. 
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We here assume that the function L (u), called the stage cost function, is continuous 
in u. We call the solution the optimal control. Note that the optimal control problem 
can also be written by using the set of feasible controls U (T, €) as 


minimize J(u) subject to u €U(T, €). (7.37) 


The minimum-time control (7.36) is the optimal control with L(u) = 1. 

Let us assume that there exists an optimal control for the problem (OPT). We 
here introduce Pontryagin’s minimum principle that gives necessary conditions for 
the optimal control. 

First, define the following function called Hamiltonian: 


H"(x,p,u) Ê p' (Ax + bu) + nL(u), (7.38) 


where y € {0, 1} is called the abnormal multiplier. The following theorem is Pon- 
tryagin’s minimum principle. 


Theorem 7.16 (Pontryagin’s Minimum Principle (PMP)). Assume that an opti- 
mal control u* of the optimal control problem (OPT) exists. Let us denote by {x* (t) : 
0 < t < T} the optimal state with the optimal control {u* (t) : 0 < t < T}, that is,’ 


t 
on eeng | e^) bu*(t)dt, Vt € [0, T]. (7.39) 
0 


Then there exist n € {0, 1} and the optimal costate {p* (t) : 0 < t < T} that satisfy 
the following conditions. 


(non-triviality condition) The abnormal multiplier n and the optimal costate p* 
satisfy the non-triviality condition: 


In| + lp“ llo > 0. (7.40) 


(canonical equation) The following canonical equations hold 
a*(t) = Ax*(t) + bu* (t), 
(7.41) 
p(t) =—A'p*(), Vt € [0,7]. 


The differential equation for p* (t) is called the adjoint equation. 


2. See the solution formula (7.2) for the differential equation & = Aæ + bu. 
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(minimum condition) The optimal control u* (t) minimizes Hamiltonian at each 
time t € [0, T]. That is, 


u*(t) = arg min H” (x* (t), p* (©), u), Vt e [0, T]. (7.42) 


luļ<1 
(consistency) Hamiltonian satisfies 
H” (1*0), p ©, u* (6) =c, Vt € [0,T], (7.43) 


where c is a constant independent oft. IFT is not fixed (as in the minimum-time 
control), then 


H"(x*(t), p*(t),u*(t)) =0, Vt € [0, T]. (7.44) 


Note that the canonical equation in (7.41) can be rewritten in terms of Hamil- 
tonian H” as 


oH" 
tO = —— (x), p*(), u" (0) 
p (7.45) 


- oH" 
t) = -———_ 
p(t) as 


(a*(t), p*(t),u*(t)), Vt € [0,7]. 
These equations are also called Hamilton’ canonical equations. 

Pontryagin’s minimum principle is a powerful tool to analyze the optimal con- 
trol (if it exists). For simple problems, we can obtain a closed form of the control 
that satisfies the necessary conditions. We call this an extremal control. We should 
note that an extremal control is not necessarily the optimal control. However, in 
some cases, we can determine the optimal control from the minimum principle. 
One example is shown in Section 7.3, the minimum-time control for the rocket 
in Example 7.2. Before the example, we will formulate the minimum-time control 
for general linear systems. 


7.2  Minimum-time Control 


Let us consider the following linear system: 
a(t) = Aw(t)+bu(t), t>0, «(0)=€eR?. (7.46) 


For this system, we consider the minimum-time control, which is given by the 
optimal control (OPT) with the stage cost L(u) = 1. Then the Hamiltonian is 
given by 


H” (x, p, u) = p' (Ax + bu) +n. (7.47) 


Minimum-time Control 139 


Let us assume that the minimum-time control exists. Then from Pontryagin’s min- 
imum principle, the optimal control u* (t) should satisfy 


u* (t) = arg min H” (æ* (t), p* (t) u), Vt € [0, T*(&)], (7.48) 
ue[—1,1] 


where x*(t) and p* (t) are respectively the optimal state and costate by the optimal 
control u*(t), and T*(&) is the minimum time. From this, we have 


u*(t) = arg min p*(t)' bu = —sen(p* (t)'b), (7.49) 
ue[—1,1] 


where sgn(-) is the sign function defined by 


1, a>0O 
sgn(a) = 
-l, a<0O (7.50) 


sgn(a) €[-1, 1], a=0. 


If the function p* (t)! b is not zero for almost all t € [0, T*(€)], then the control 
u*(t) takes values of only +1 for almost all t. Such a control is called a bang-bang 
control. 


Lemma 7.17. If (A, b) is controllable, then the function p*(t)'b is not zero for 
almost allt € [0, T*(€)]. 


Exercise 7.18. Prove Lemma 7.17. 


For the minimum-time control problem, we have the following existence and 
uniqueness theorems. 


Theorem 7.19 (Existence). Jf the initial state € is in the controllable set R defined 
in (7.30), then a minimum-time control exists. 


Theorem 7.20 (Uniqueness). Assume that (A,b) is controllable. Then the 
minimum-time control is (if it exists) unique. 


Exercise 7.21. Prove Theorems 7.19 and 7.20. 
The following corollary is easily proved from Theorems 7.12, 7.19, and 7.20. 


Corollary 7.22. Assume that (A, b) is controllable and A is stable. Then for any 
€ e R, the minimum-time control u* € U(E) uniquely exists. 
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7.3 Minimum-time Control of Rocket 


Here we derive the minimum-time control of the rocket in Example 7.2 by using 
Pontryagin’s minimum principle. 

Now, from Example 7.5, the pair (A, b) is controllable. It is also easily seen that 
A is stable since A has a multiple eigenvalue of 0. Therefore, from Theorem 7.22, 
there uniquely exists the minimum-time control u*. 

Let us define the optimal state and costate by 


* xi (t) x at 
z*(t) =|" 1 ; 2 ie f (7.51) 
o=o = [RG 

For simplicity, we assume the mass of the rocket m = 1. 
Then the Hamiltonian H” (æ, p, u) in (7.47) is given by 
H"(a, p,u) = p| (Aw + bu) +9 = pixr + puu + 1. (7.52) 


The canonical equation (7.41) for the costate p*(t) is given by 


Pi (t) =9, 


7.53 
PO = -pi ©. ee) 


Let pj (0) = zı and p3(0) = x2. Then the solution to the differential equation 
(7.53) is given by 


pit) =m, 


7.54 
p(t) = m2 — Tt. ( 


Since T is not fixed, from the condition (7.44), we have H"(a* (t), p* (t), u*) = 0. 
That is, 


PIHE) + pa(thu* +4 =0. (7.55) 


If zı = 22 = 0, then pi (t) = p3(t) = 0 from (7.54), and hence 7 = 0 from 
(7.55). But this contradicts the non-triviality condition (7.40). Therefore, mı 4 0 
or 22 Æ O, that is p* (0) Æ 0. 

Next, from (7.49), we have 


u*(t) = —sgn(p* t)" b) = —sgn(p3 (t)). (7.56) 


From (7.54), p(t) is a linear function p5(t) = 22 — zıt. Then we need to 
check the following cases: 
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k u*(t) m 
E ir cme 
0 0 
=] 
ee ut (t) 
(i) (ii) 
i i u* (t) 
T2 t 0 polt) t 
0 p> (t) T2 
=i =] 
u* (t) 
(iii) (iv) 


Figure 7.6. Costate p3(t) = x2 — zıt and corresponding extremum control u* (t) from (7.56). 


(i) mı < 0, z2 < O with (z1, 22) # (0, 0) 
(ii) z1 > 0, z2 > 0 with (21, 22) Æ (0, 0) 
(iti) 21 <0, m2>0 
(iv) 21 > 0,22 <0 


Extremum controls given in (7.56) for the 4 cases are shown in Figure 7.6. From 
this figure, it is easily shown that each extremum control takes its values of +1 for 
almost all ż, that is bang-bang. We note that the number of switching is at most 
one from Figure 7.6. 

Next, we compute the trajectory 2(t) when u(t) is constant (i.e. +1). From 


(7.5), we have 


xit) = x2(t), xı(0)=&, 


. (7.57) 
X(t) =u(t), x2(0) =. 


If u(t) = +1, then 


1 
xi(t) = +50 + t+, 
eat) = t + &2. 


(7.58) 


Eliminating the time variable t gives 


1 1 
x(t) = tzr) +4 F 52: (7.59) 


142 Dynamical Systems and Optimal Control 


Figure 7.7. Flow of state (xı(t), x2(t)) by constant control u(t) = 1 (solid curve) and 
u(t) = —1 (dashed curve). 


That is, when the control u(t) is constant +1, then the state (x1 (t), x2(t)) moves 
on the following parabolic curve: 


1 1 
u= itä- ifu@=1. (7.60) 
pee pE e if u(t) = —1. (7.61) 


Figure 7.7 shows the flow of the state (x; (t), x2(t)) by some of these parabolic 
curves with directions of the state to move. Note that the parabolic curves defined 
in (7.60) and (7.61) go through the point (1, 2). 

To achieve the terminal state æ(T) = 0, the final trajectory must on the 
parabolic curve that go through the origin: 


1 
xj = 5*2» if u(t) = 1, 


I (7.62) 
x, = -572 if u(t) = —1. 
From this, if there is no switching, that is, in the cases of (see Figure 7.7) 
(i) u*(t) =1 
(ii) u*(t) = —1 
then, the initial state (€1, €2) should be on the parabolic curve 
y4 £ {(x1, x2) e R°: xı = x$ /2, x2 < Oo} (7.63) 


or 


y- Ê {@1, x2) € R? : x1 = —x3/2, x2 < O}. (7.64) 
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Figure 7.9. 4 cases of state trajectories from initial state (€,&) on the curve y4. 


Figure 7.8 shows the two curves y+ and y_. In fact, we can easily show that 


e If the initial state (C1, €2) is on the curve y+, then u*(t) = 1 is the unique 
extremum control. 

e Ifthe initial state (C1, 2) is on the curve y_, then u*(t) = —1 is the unique 
extremum control. 


The proof is shown below. 

Assume (¢1, č2) € y4. As mentioned above, there are 4 extremum controls 
with (i)—(iv). Now, u*(t) = 1 is for the case (i). The point A in Figure 7.9 is the 
initial point, and the state can reach the origin by u* (t) = 1 through the curve y+. 
However, for the other cases (ii), (iii), and (iv), the state never reaches the origin 
from the initial point A. For the case (ii), by the control u* (t) = —1, the state starts 
at A on the curve y^ to the direction to C, and never reaches the origin. For the 
case (iii), the state moves on the curve y/ from A to C by the control u*(t) = —1, 
which is switched to u*(t) = +1 at C. Then the state moves on the curve y$, 
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Figure 7.10. State trajectories from initial points A and A’. 


which never reaches the origin. Finally, for the case (iv), the state starts from A to B 
on the curve y4 by the control u* (t) = +1, which is switched to u* (t) = —1 at B. 
The state then moves from B on the curve y” to the indicated direction and never 
reaches the origin. In summary, (i) u* (t) = 1 is the unique extremum control, and 
hence if the minimum-time control exists, this is actually the optimal control. The 
same discussion can be applied for the initial state (1, €2) on the curve y_, and 
the unique extremum control is u* (t) = —1. 
Next, let us consider the initial state (€,, €) is outside the curve 


y & y Uy- = {(x1, x2) € R? : x1 = —x2|x2|/2}. (7.65) 
Let us define two regions Ry and R_ divided by the curve y : 


Rs Ê {@1, x2) € R? : x1 < —x2|x21/2}, 


i (7.66) 
R- Ê {@1, x2) € R? : x1 > —x2|x21/2}. 


Figure 7.8 shows these regions. We call the curve y the switching curve. 

Now assume the initial state (¢1, €2) is at A in Ry as in Figure 7.10. From the 
point A, the curve y4 defined in (7.60) is plotted. By a constant control u(t) = 1, 
the state moves on the curve y{ to the indicated direction from A. At some time, the 
state touches the switching curve y— at B, and the control is switched to u(t) = —1. 
From B, the state goes on the switching curve to the origin. The control switched 
from +1 to —1 is the control in (iii) in Figure 7.6. We can easily show that this is 
the unique extremum control from any initial point in R+, in a similar way to the 
case where the initial state is on the curve y = yy U y—. 

Then, let us consider the initial state (€1, €2) € R— at A’ in Figure 7.10. First, 
by the constant control u(t) = —1, the state moves on the curve y^ from A’ to PB’. 
Then the control is switched from —1 to +1, and the state is steered to the origin 
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along the curve y+. This control is for the case (iv) in Figure 7.6, and actually this 
is the unique extremum solution. 

In summary, the extremum control u* (t) of the minimum-time control problem 
is given by 


l, if x(t) € y+ U Ry \ {0}, 
u*(t)= 4—1, if æ(t)ey-UR-\{0}, (7.67) 
0, if x(t) = 0. 


The control u*(t) depends on the state a(t), and hence the control is a feedback 
control, which changes its value (+1 or 0) based on the observation of the state x(t). 


Exercise 7.23. Compute the minimum time T*(€) from € = (€1, 2) to the 
origin, and the switching time when € € Ry and Ẹ € R_. 


7.4 Further Reading 


For the basics of control theory with state-space formulations, I recommend a nice 
book by Ogata [89]. For the controllable set, see [103]. The proof of Pontrya- 
gins minimum principle is found in [66]. Pontryagin’s minimum principle is also 
referred to as Pontryagin’s maximum principle, which is mathematically equivalent 
to the minimum principle. The book by Clarke [24] is one of the most reliable 
books on the Pontryagin’s maximum principle. For the minimum-time control, see 


the classical books of [2, 49, 96]. 
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Chapter 8 


Maximum Hands-off Control 


In this chapter, we introduce a new optimal control problem called maximum 
hands-off control, which is the sparsest control among all feasible controls. 


Key ideas of Chapter 8 


e Maximum hands-off control is described as L°-optimal control. 
e Under the assumption of non-singularity, L°-optimal control is equiva- 


lent to L!-optimal control. 


e Maximum hands-off control is a ternary signal that takes values of +1 


and 0. Such a ternary control is called a bang-off-bang control. 


8.1 L? Norm and Sparsity 


Here we introduce mathematical preliminaries for maximum hands-off control. 
Define the support of a function u(t) on a finite interval [0, T] by 


supp(u) = {t € [0, T] : u(t) 4 0}. (8.1) 


By using the support, we define the L? norm by the length of the support of func- 
tion u, that is, 


lullo = «(supp(w)), (8.2) 


146 


L? Norm and Sparsity 147 


u(t) 


Figure 8.1. The L° norm of the function u(t) is ti + (T — b). 


where u (S) is the Lebesgue measure of a subset $ C [0, T]. From this definition, 
the L° norm of a continuous-time signal is the total length of time duration on 
which the signal takes nonzero values. 


Example 8.1. Let us consider a function u(t) in Figure 8.1. The function u(t) is zero 
over the interval |t, t2], and the support set ofu is 


supp(u) = (0, t1) U (2, T) C [0, T]. (8.3) 
From this, the L? norm ofu is 
lulo = “(supp@)) = ti + (T —h) =T -()- t1). (8.4) 


In the above example, the value t2 — ty is the length of the interval [t1, t2] on which 
u(t) = 0. If ||u||o is much smaller than the total length T (ie., ||u|lo « T), then 
the signal is said to be sparse. This notion is an analogy of the sparsity of vectors 
studied in Part I of this book. 


Define 
0, ifu=O, 
lui? 4 n (8.5) 
1, ifu#0, 
then the L? norm in (8.2) can be written as 
T 
lullo = i |u(t)|°dt. (8.6) 


Note that the L° norm does not have the absolute homogeneous property (see 
Definition 2.7, p. 20). In fact, if we take a non-zero scalar a such that |a| 4 1, 
then 


laullo = lullo F lalllullo. (8.7) 
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Note also that a sparse signal u(t) on [0, T] has a time duration whose length is 
positive, on which the control u(t) is exactly zero. This means that the function 
u(t) is not a real analytic function.' For example, a polynomial function 


p(t) =t” ta,_yt” | 4+---+ ait +40, (8.8) 


a trigonometric function sin(@t) (w # 0), an exponential function e”, and their 
sum or product are never sparse. 


8.2 Practical Benefits of Sparsity in Control 


Let us consider the sparse control signal u(t), t € [0, T] shown in Figure 8.1. This 
control signal is exactly zero on the time interval [t1, t2]. In an electromechanical 
system, the control signal is transformed into a mechanical motion by an actuator. 
An electric motor is an example, which transforms the control signal given as an 
electric current into torque. Usually, an amplifier is attached between a controller 
and an actuator to supply energy to the actuator enough to generate a mechanical 
motion. Hence for actuation, we need not only a control signal but also enough 
energy. 

By using a sparse signal as in Figure 8.1, we can stop energy supply to the actu- 
ator over the time interval [t), t2]. That is, we can save consumption of electric 
power or fuel over this interval. We call such a control a hands-off control, which 
is also known as gliding or coasting. This control strategy is actually used in prac- 
tical control systems. A stop-start system [34, 64] in automobiles is an example 
of hands-off control. It automatically shuts down the engine to avoid it idling for 
a long duration of time. By this, we can reduce CO or CO2 emissions as well as 
fuel consumption. Also in hybrid vehicles [17, 87, 105], the internal combustion 
engine is stopped when the vehicle is at a stop or the speed is lower than a preset 
threshold, and the electric motor is alternatively used. Other examples are found 
in railway vehicles [18, 63] and free-flying robots [113]. Hands-off control is also 
desirable for networked and embedded systems since the communication can be 
stopped during a period of zero-valued control. This property is advantageous in 
particular for wireless communications [60]. By these properties hands-off control 
is also known as green control. 


1. A function u(t) is said to be real analytic if it is an infinitely differentiable function such that the Taylor series 
at any point fo € (0, T) converges to u(t) for t in a neighborhood of to pointwise. See [98, Chapter 8] for 
details. 
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8.3 Problem Formulation of Maximum Hands-off 
Control 


Let us consider the following linear time-invariant system: 
x(t) = Aa(t)+bu(t), t>0, 2(0)=€eRY%, (8.9) 


For this system, we consider the optimal control problem (OPT) (p. 136) with the 
stage cost function 


L(u) = |u°. (8.10) 


Namely, we seek the optimal control that minimizes the L° cost function 


T 
Jou) Ê lullo = | lw (t)ledr, (8.11) 


among all feasible controls. We call this problem an L°-optimal control problem or 
a maximum hands-off control problem. 


L°-optimal control problem (L° OPT) 


For the linear time-invariant system 
x(t) = Ax (t) + bult), t>0, wx(0)=eeR%, 
find a control {u (t) : t € [0, T]} with T > O that minimizes 
T 
D) = Illa = f Moar 
subject to 


x(T)=0, 


lullo < 1. 


We call the solution of this optimal control problem the L°-optimal control, or the 


maximum hands-off control. 
The stage cost function (8.10) is discontinuous and non-convex, as shown in 
Figure 8.2. By borrowing the idea of sparse representation to use the £! norm for 
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=1 0 1 


Figure 8.2. Stage cost functions |u|? and |u| in L? and L! optimal control problems. 


£? norm optimization, we introduce the following cost function with the L! norm: 


T 
napuh = f Old (8.12) 


As shown in Figure 8.2, the stage cost function |u| is an approximation of |u|°. 
In fact, this approximation is mathematically explained as the convex relaxation. 
That is, the L! norm |u|], is the convex relaxation of ||u||g when ||u|loo < 1. See 
[112, Section 1.3.2] for details. 

Now we formulate the L!-optimal control problem. 


L'-optimal control problem (L'-OPT) 


For the linear time-invariant system 


a(t) = Aa(t)+bu(t), t>0, 2(0)=€eR’, 


find a control {u(t) : t € [0, T]} with T > O that minimizes 


T 
n= thal =f ud 
subject to 


x(T)= 0, 


lullo < 1. 


We call the solution of this optimal control problem the L!-optimal control. This 
optimal control is also known as minimum fuel control, which was widely studied 
in 60s for rocket control. 
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The L!-optimal control (L!-OPT) is a convex optimization problem since the 
stage cost L(u) = |u| is convex in u and the constraints are also convex. Although 
the variable u is a function, which is a member of infinite dimensional function 
space L% (0, T), the problem can be reduced to a finite-dimensional optimization 
problem via time discretization studied in Chapter 9. 


8.4 L!-optimal Control 


Here we investigate properties of L! optimal control by using necessary conditions 
from Pontryagin’s minimum principle. 
For the L!-optimal control problem (L!-OPT), the Hamiltonian is given by 


H” (æ, p, u) = p' (Ax + bu) + nlu]. (8.13) 


We first consider the case y = 1 (the normal case). Let u* denote the L! optimal 
control, and æ* and p* the associated optimal state and costate, respectively. From 
the minimum condition in the minimum principle, we have 


u*(t) = arg min H! (x* (©), p*(t), u) 


luļ<1 


= arg min{p*(1)" (Ax* (£) + bu) + lul} (8.14) 


Jul<1 


= arg min{p*(t) ' bu + |u|} 


luļ<1 
Now from 


p*(t)'b+1)u, ifu>0 
p*(t)' bu + |u| = l ) (8.15) 
(p*(t)'b-1)u, ifu<0 

we have the solution to the minimization problem in (8.14) as 

u* (t) = —dez(p*(t)'b), (8.16) 
where dez(-) is called the dead-zone function defined by 

-1l, ifw<-l 
dez(w) = } 0, if -l<w<l 


1, if 1 <w 
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dez(w) 


Figure 8.3. Dead-zone function dez(w). 


dez(w) € [-1,0], if w = —1 
(8.17) 
dez(w) € [0,1], if w=1 


Figure 8.3 shows the graph of the dead-zone function. 
Exercise 8.2. Show that (8.16) is the solution to the minimization problem (8.14). 


If there is a time interval (t1, f2) on which p*(t)'b = 1 holds, then from 
(8.17), we cannot uniquely determine u* (t) on this interval. We call such a time 
interval a singular interval. If an L'-optimal control problem has a singular interval 


whose length is positive, then we call the problem a singular problem. On the other 
hand, if 


u({t € [0, T]: |p*(t)"b| = 1}) =0 (8.18) 


holds, then the L!-optimal control problem is said to be non-singular. The follow- 
ing lemma gives a sufficient condition for the non-singularity. 


Lemma 8.3. /f(A, b) in (8.9) is controllable and A is nonsingular, then (8.18) holds 
(i.e., the L'-optimal control problem is non-singular). 


From now on, we say (A, b) is non-singular if (A, b) is controllable and A is non- 
singular. 


Exercise 8.4. Prove Lemma 8.3. 
From Lemma (8.3), if (A, b) is non-singular, then we have 
p’(t)'bA+1,  foralmostall t € [0, T]. (8.19) 


Then, from (8.16) and (8.17), the L!-optimal control takes values +1 or 0 for 
almost all ¢ in [0, T]. We call such a control a bang-off-bang control. Figure 8.4 
illustrates the bang-off-bang property of L!-optimal control. We summarize this 
property as a theorem. 
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u(t) 


Figure 8.4. L!-optimal control (bang-off-bang) u*(t) (top) and function p*(t)'b. 


Theorem 8.5. Assume that (A, b) is non-singular. Then the L'-optimal control is 
bang-off-bang (if it exists). 


The bang-off-bang property is important to examine the relation between L! 
and L° controls as shown in the next section. 


Remark 8.6. The function p* (t)! b is given by 
p(t)" b= (e~4"'p*@)) b = p*(0) "eb, (8.20) 


from the adjoint equation for p* (t) in (7.41). Therefore the function p* (t)' b is con- 
tinuous, and represented by 


d 
AORE > cite“, (8.21) 
i=l 


where Ài is the i-th eigenvalue of A and ci (t) is a polynomial with degrees up to d. It 
follows that the number of switching in u* (t) is finite, and the value changes between 1 
and0 or —1 and 0, and never changes between 1 and —1. Therefore, ifu* (t) switches, 
then there exists a time duration with positive length on which u* (t) = 0. 


Finally, let us consider the case y = 0 (the abnormal case). The Hamiltonian is 
given by 


H? (ax, p, u) = p' (Ax + bu). (8.22) 
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Then the optimal control u* (t) satisfies 


u*(t) = arg min H°(x*(t), p*(t), u) 
ue[—1,1] 


= arg min p* (t)! bu 
ue[—1,1] (8.23) 
—1, if p*(t)'b> 0, 
=4], if p*(t)'b <0, 


[-1,1], if p*(t)'b=0. 


If (A, b) is controllable, then p*(t)'b Æ 0, and hence the control is bang-bang, 
taking values of +1. With this control, the L'-optimal value is 


T 
Jı (u*) =f |u“(t)|dt =T. (8.24) 


Moreover, if T*(€) < T < œ, then there exists the minimum-time control 
* 

time 
the following control: 


u* _(t) to achieve æ(T*(€)) = 0. By using this minimum-time control, define 


i(t) = Ba ifO<t<T*(€), (8.25) 


0, if T*(€) <t <T. 


It is easily shown that ù is a feasible control, that isu € U(T, €). Also, with this ù, 
we have 


T T*(€) 
n= | Odi [ining lat = TO <T =h) 
(8.26) 


Hence, the control u* (t) can never be L! optimal, and hence the case y = 0 never 
happens. 

The abnormal case (y = 0) happens when T = T*(€). In this case, the set 
of feasible controls is U (T* (€), €) = {už ne) a singleton of the minimum-time 
control, and hence the cost function is meaningless to choose a control from the 
feasible set. In this book, we do not discuss the abnormal case any further. 


8.5 Equivalence Between L°’ and L! Optimal Controls 


In this section, we study the equivalence between L° and L! optimal controls. 
The following theorem is a fundamental theorem for the equivalence. 
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Theorem 8.7. Assume that there exists an L'-optimal control that is bang-off-bang. 
Then it is also L? optimal. 


Proof: Define Jo(u) = ||u||o and Jı (u) = ||u||1. From the assumption, there exists 
an L!-optimal control uł that is bang-off-bang. Since uï is a feasible control, the 
set of feasible controls U (T, €) is non-empty. Then, for any u € U(T, €) we have 


T 
Jı(u) = J |u(t)|dt = | |u(t)|dt < 1 ldt = Jolu). (8.27) 
JO J supp(u) supp(u) 
Since uj is bang-off-bang, we have 
T 
Ji (uj) =i jui (t)|dt = ldt = Jo(u}). (8.28) 
0 supp(u7) 


From (8.27) and (8.28), we have 


Jo(u;) < Jou), Vu €U(T, £), (8.29) 


and hence u* minimizes Jo(w). That is, u* is also L? optimal. 

From Theorem 8.5, if (A, b) is non-singular, then the L!-optimal control is 
bang-off-bang, that is, the optimal control u*(t) takes values 0 or +1 for almost 
all t € [0, T]. From this property, we can obtain the following theorem. 


Theorem 8.8. Assume that there exists at least one L'-optimal control. Assume also 
that (A, b) is non-singular. Then there exists at least one L°-optimal control, and the 
set of L°-optimal controls is equivalent to the set of L! -optimal controls. 


Proof: Let Uğ and U¥ be the sets of L? and L! optimal controls, respectively. From 
the assumption, Uf is non-empty. Take uj € Uj arbitrarily. Then, from Theorem 
8.5, ut is bang-off-bang. It follows from Theorem 8.7 that uj € Uj, and hence 
Ur C Up- 

Then we prove Uy C Uy. Take uy € U C U(T, &) arbitrarily. Take also 
uy € UF C U(T, £) independently. From (8.28) and the L! optimality of uj, 
we have 


Jo(uy) = Jı (uï) < Ji(ug)- (8.30) 
On the other hand, from (8.27) and the L? optimality of už, we have 

Jı (u5) < Joug) < Jolu). (8.31) 
From (8.30) and (8.31), we have 


Jo(uy) = Jı (uï) < Jı (u5) < Joug) < Jouï). (8.32) 
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It follows that Jı (uj) = Jı (ug), and uj minimizes Jı (u). That is, we have uj € 
Uj and hence Uy C UF. 


8.6 Existence of L°-Optimal Control 


Here we consider the existence of L°-optimal control. 


8.6.1 L?-Optimal Control 


To consider the existence of L°-optimal control, we introduce the LP -optimal con- 
trol with p € (0, 1). The optimal control problem is described as follows: 


L?-optimal control problem (L?-OPT) 


For the linear time-invariant system 


a(t) = Ax(t) + bult), t>0, 2(0)=€eR%, 


find a control {u (t) : t € [0, T]} with T > O that minimizes 


T 
n= Yul? = ju(t)|Pdt 


with p e (0, 1), subject to 


x(T)= 0, 


lullo < 1. 


First, we prove an interesting relation between the L? norm’ with p € (0, 1) and 


the L? norm. 


Lemma 8.9. Suppose u € L'(0,T). Then u is also in LP (0, T) for any p € (0, 1), 
and 


lim |lull> = lullo. (8.33) 
p—>0+ 


Exercise 8.10. Prove Lemma 8.9. 


2. Strictly speaking, the L? “norm” with p e (0, 1) is not a proper norm since the triangle inequality does not 
always hold. 
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Now, let us look into the L?-optimal control with p € (0, 1). The Hamiltonian 
is given by 


H” (æ, p, u) = p' (Aw + bu) + nlu”. (8.34) 


Let us consider the normal case (y = 1). From the minimum condition in Pon- 
tryagins minimum principle, we have 


u* (t) = arg min H” (x* (t), p* (t), u) 
ue[—1,1] 


= arg min{p*(t) | bu + |u|} 
ue[—1,1] 


—1, if p*(t)'b> 1 
(8.35) 
0, if —l<p*(t)'b<1 
= ł], if p*(t)'b < —1 


{-1,0}, if p*()"b=1 
{0, 1}, if p*(t)'b=-1 


From this, L?-optimal control is always bang-off-bang. We mention this in the 
following theorem. 


Theorem 8.11. The L?-optimal control with p € (0,1) is bang-off-bang (if it 
exists). 


Also, it is shown that the set of L?-optimal control is equivalent to the set of L°- 
optimal controls. 


Theorem 8.12. Assume that there exists at least one LP -optimal control with p € 
(0, 1). LetUg andy, be the sets of L? and LP optimal controls respectively. Then we 
have 


us =U. (8.36) 
Exercise 8.13. Prove Theorem 8.12. 

From Theorems 8.11 and 8.12, we have the following theorem. 

Theorem 8.14. The L°-optimal control is bang-off-bang (if it exists). 


The difference of Theorem 8.12 from Theorem 8.8 for the L!-optimal control 
is that for the LP optimal control we do not need the assumption of the non- 
singularity of (A, b). This is the key to prove the existence of L? optimal control. 
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8.6.2 Existence Theorems 


From Theorem 8.12, if we show the existence of L?-optimal control for some 
p € (0, 1), then Uj is non-empty, and hence there exists at least one L°-optimal 
control. The following theorem is on the existence of L?-optimal control with 
p > 0? 


Theorem 8.15. Suppose that the initial state € € R4 and the time T > 0 are chosen 
such that E € R(T). Then, then there exists an LP -optimal control with p > 0. 


Proof: Assume € € R? and T > 0 satisfy € € R(T). Then there exists a feasible 
control u € U (T, €), and hence the feasible set U (T, €) is non-empty. Define 


J = inf(llullp : u € UCL, 8). (8.37) 


Since u € U (T, €) satisfies ||ulloo < 1, we have J* < oo. Then, from the defini- 


tion of Jg there exists a sequence {ur }en C U (T, €) such that 


lullip = J}. (8.38) 


lim 
l>0o 
Now, since u; € U(T, £), we have 


lullo < 1, (8.39) 


and hence {uj}ien C Boo £ {u € L®(0,T) : llullo < 1}. It is known that 
the unit ball Boo is sequentially compact in the weak* topology of L° (0, T) [74, 
Theorem A.9]. That is, there exists a subsequence {uy }res, S C N, such that there 


exists Uoo € Boo and 


T 
lim | f (t)(ur(t) — Uoo(t))dt = 0, (8.40) 
V0 Jo 
for any f € L'(0, T). Now, since uy € U(T, €), we have 
T 
gs -f e “buy (tdt, Wes. (8.41) 
0 


On the other hand, from (8.40) with f (t) = e74"b, we have 


T T 
lim e “buy (t)dt = f e^ bus (t)dt. (8.42) 
0 


V’>00 Jo 


3. To show the existence of L°-optimal control, we just need to prove the existence of LP -optimal control with 
p € (0, 1). However, Theorem 8.15 gives more general result. 
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That is, 


T 
¿= [ eA buco (t)dt. (8.43) 
JO 


Also since Ugo € Boo, we have ||Uoolloo < 1. Therefore, uoo € U(T, €). 
Next, let us define 


T 
Oy z. uy (t)?sgn(Uoo(t)”)dt. (8.44) 
0 


Similar to (8.40), for the sequence {ul yien C Boo and f(t) = sgn(Woo(t)?), we 
have 


T T 
lim ry = | uoo l(t)” sgn (oo (t)” )dt = J |uoo(t)|?dt = lucoll}. (8.45) 
l'3 00 Jo JO 
On the other hand, we have 
T 
(rl = frat = elf > Jp. (8.46) 
0 
as l’ —> oo, and hence from (8.45), we have 
lollo < Jg (8.47) 
Now, from (8.37), Jp is the minimum value of ||u Ib over U(T, €), and hence 


lluoollb = J}. (8.48) 


That is, uoo € U(T, &) is an L?-optimal control. 
From Theorem 8.15 and Theorem 8.12, we have the following theorem. 


Theorem 8.16 (Existence of L° optimal control). JFE € R(T), then there exists 
an L°-optimal control. 


The condition of € € R(T) is equivalent to € € Rand T > T*(€). Hence, we 
have the following lemma. 


Lemma 8.17. Let u* be the L°-optimal control with € e R(T). Then \|u*|lo < 
T*(€). 


Proof: This is easily shown by considering a feasible control in (8.25). 
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8.7 Maximum Hands-off Control of Rocket 


Here we compute the maximum hands-off control of the rocket considered in 
Example 7.2 (p. 127) in the previous chapter. We assume the mass m = 1 for 
simplicity. 

We now compute the L!-optimal control. From (8.13), the Hamiltonian with 
q = 1 is given by 


0 1 0 
H (x, p, u) =p! (f l r+ H u) + |u| = pıxı + pzu + |u|, (8.49) 


where p = (p1, p2). Let denote by u* the L!-optimal control, and x* = (xf, x3), 
p* = (pj, p3) the associated optimal state and costate, respectively. From (8.16), 
the L!-optimal control u* (t) satisfies 


u* (t) = —dez(p3(t)), (8.50) 


where dez(-) is the dead-zone function defined in (8.17) (see also Figure 8.3). 
Then, the adjoint equation of the costate p* (t) becomes 


T 
a] s o i Ee z l 0 | 
i (t) 0 0 p3(t) —p* al (8.51) 


The solution of this differential equation is given by 
PO =r, P3) =r- Tt, (8.52) 
where 
zı = pý (0), m2 = p3(0). (8.53) 


It follows from (8.52) that if z1 # 0 then p3 (t) is a first-order linear function of 
t and hence p(t) is monotone. Therefore, from (8.50) and the definition of the 
dead-zone function (8.17), switching occurs at most twice, and the value changes 
between —1 and 0, or between 0 and 1. From this observation, the L!-optimal 
control is given as follows (for details, refer to [2, Section 8.5]). 

Define the following regions (see Figure 8.5). 


Y= CEJ eR: x= —x2lx21/2} 


Rı = CEJ e R?: x> —x2/2, x2 > o) 


R, = CEJ c R? :X1 < —x2/2, X2 > o} 
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Figure 8.5. Curve y (thick solid line) and regions Rj, R2, R3, and R4. 


R3 = CEJ e R°? : xı <x3/2, x2 < o} 
(8.54) 
R4 = CEJ eR? :x,> ino. x2 < o} 
Also, define the following two regions: 
v= CEJ eR: —x2/2 —x1/x2 > r} 
(8.55) 
V} = [œ x2) e R? : x2/2 — x1/x2 > | 


Then, the L!-optimal control is given as follows: 
1. If (C1, G2) € Ry or (1, &2) € R4M V_, then the optimal control is given by 


-l, ifO<t<t 
u*(t) = +0, ft <t <fo (8.56) 
1, ift2<t<T 


where 


T+- -6-46 - 28 


ti = ` 
2 (8.57) 
T+Ot+ f(T -6-4 - 28 
b= : 
2 


2. If (1, &2) € R3 or (č1, €2) € Ro N Vy then the optimal control is given by 


1, ifO<t <f3 
u*(t) = 40, ifR<t<t (8.58) 
-1l, ift <t<T 


162 Maximum Hands-off Control 


optimal control 


—' optimal 
== optimal 


time (sec) 


Figure 8.6. L!-optimal control (solid line) and L?-optimal control (dashed line). 


where 


T—-4-J/(T+ 4) +46 -22 


t3 = 7 
(8.59) 
T-O+/T +4) +44 -28 
t4 = x 
2 
3. If (č1, č2) € y then the optimal control is given by 
— , ifO<t 
po- [TEM fos <k en 
0, if bce ace 


4. If (G1, &2) € RaN(V_)* or (G1, &2) € RoM(V_)°,* then the control problem 
is singular, and the optimal control cannot be uniquely determined from the 
minimum principle. 


Figure 8.6 shows the L!-optimal control with the final time T = 5 and the 
initial state (C1, €2) = (1,1) €e R}. Figure 8.7 shows the associated optimal state 
trajectory {(x] (t), x3 (t)) : 0 < t < 5}. In these figures, we also show the results 
of L?-optimal control that minimizes the L? cost function 


T 
l= f weorar, (8.61) 


among the feasible controls. From Figure 8.6, we can see that the L!-optimal con- 
trol is sparse while the L?-optimal control is not. In fact, the L!-optimal control 


4. (-)© denotes the complement set. 
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state-space trajectory 
i —! optimal | 
- -? optimal 


0.5 


x2 


-0.5 } ~ Tas 


Oo a a a Fas, et) J 


xi 


Figure 8.7. Optimal state trajectory (xī (t), x3(¢)): L!-optimal control (solid line) and 
L?-optimal control (dashed line). 


is bang-off-bang, and hence this is equivalent to L°-optimal control. That is, the 
L}-optimal control has the maximum length of time duration on which the control 
is exactly zero. From (8.57), this time length is given by 


[t1, t2] = [3 — V10/2, 3 + V/10/2] ~ [1.4189, 4.5811]. (8.62) 


and the L? norm of the L}-optimal control u* is ||u* llo = V10 ~ 3.1623. On this 
time duration, the state trajectory (xj (t), x3 (t)) is parallel to the xı axes. Since x1 
is the portion and x2 is the velocity of the rocket, this state trajectory means that 
the rocket moves at a constant velocity. The rocket consumes no fuel on this time 
duration, and hence we can cut fuel consumptions and also we can reduce CO2 
emissions, etc. That is, the control is green. It is clear that L?-optimal control does 
not have such a nice property of sparsity. 


8.8 Further Reading 


For the L!-optimal control (minimum fuel control), the most detailed information 
can be obtained from the classical book by Athans and Falb [2]. The equivalence 
theorem between L? and L! optimal controls was first proved in [82, 83]. 

In this book, we consider only linear systems, but the equivalence holds for non- 
linear systems of the following type: 


a(t) = f(a(t))+¢(a@))u(t), +> 0. (8.63) 


See [82, 83] for details. 
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Necessary conditions of the L°-optimal control are also obtained in [19] by the 
non-smooth version of Pontryagin’s minimum (or maximum) principle [24]. For 
the theory of LP spaces, see [65, 99, 114]. 

For feedback control implementation of maximum hands-off control, one can 
adopt the model predictive control [55, 86] and the self-triggered control [83]. 

An interesting extension of maximum hands-off control is distributed control 
for multi-agent systems discussed in [52]. 
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Numerical Optimization by Time 
Discretization 


As we have seen in Section 8.7, the maximum hands-off control (or L!-optimal 
control) is obtained in a closed form when the plant is very simple as the double 
integrator given in Example 7.2 (p. 127). However, for a general system 


x(t) = Aa(t)+bu(t), t>0, 2(0)=€eRY%, (9.1) 


we need to rely on numerical computation to obtain the optimal control. In this 
chapter, we introduce the method of time discretization to numerically obtain the 
L! -optimal control. 


Key ideas of Chapter 9 


e By time discretization, the L!-optimal control problem (L!-OPT) is 
reduced to a finite-dimensional £! optimization problem. 

e In time discretization, the control is assumed to be piecewise constant 
by a zero-order hold. 

e The reduced £! optimization can be efficiently solved by ADMM. 
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9.1 Time Discretization 


First, we discretize the time interval [0, T] into n subintervals as 
[0, T] = [0,h) U [h, 2h) U - - - U [nh — h, nh], (9.2) 


where h > 0 is the sampling time and n € N is the number of subintervals such 
that T = nh. 

On each subinterval, we assume the control u(t) is constant. More precisely, we 
assume the control is given by 


u(t) = u(kh) =uglk], te [kh,(e+Dh), k=0,1,2,...,n—1. (9.3) 
This is the output of a zero-order hold of a discrete-time signal 
uq = {uq[0], wall], ..., wala — 1}. (9.4) 


This assumption is actually reasonable for networked digital control systems where 
control values are computed in a digital computer, transmitted through a wireless 
communication network, and applied to an actuator through a D/A converter. The 
zero-order hold is the simplest model of a D/A converter. 

Let us compute the state transition under the zero-order assumption on the con- 
trol. The solution to the state-space equation in (9.1) is given by (see Exercise 7.1 
on p. 127) 


t 
x(t) = eA“ (to) + | e^) bu(t)dr, (9.5) 
J to 


where 0 < tọ < ti. Take 
to=kh, t =kh+h, ke{0,1,2,...,n— 1}. (9.6) 


Then from (9.5) we have 


kh+h 
a(kh +h) = e4"a(kh) + f eAkh+h—t) bu (7) dt 
(9.7) 


h 
= eh a (kh) + | e^h-)bu(t + kh)dt. 
0 


Define 


xalk] = a(kh), uglk] Su(kh), k=0,1,...,n-1, (9.8) 
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and 
xain] = (T). (9.9) 


From the zero-order-hold assumption (9.3), the control u(t) takes a constant value 
ug[k] = u(kh) on the subinterval [kh, kh +h) as shown in Figure 9.1. Then from 
(9.7) we have 


h 
xalk + 1] = ef" xalk] + (/ Mt-par) uq[k]. (9.10) 
0 


It follows that the differential equation (9.1) is transformed into the following dif- 
ference equation: 


xalk + 1] = Agag[k] + baualk], k=0,1,...,n— 1, (9.11) 


where 
h 
Aq Ê e^}, nef e“'bdt. (9.12) 
0 


Next, define the control vector 


ug[0] 
uall] 


l> 


e R”. (9.13) 
ie 1] 
By using this, the terminal state æ(T ) is described as 
a(T) = xain] = —C+ Ou, (9.14) 
where 


(9.15) 


Figure 9.1. Zero-order hold output of discrete-time signal {ug[k]}. 
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Exercise 9.1. Show the equation (9.14) by solving the difference equation (9.11). 


9.2 Controllability of Discretized Systems 


The discrete-time system (9.11) is called the zero-order-hold discretization or step- 
invariant discretization of the continuous-time system (9.1). Figure 9.2 shows the 
zero-order-hold discretization of (9.1). In this figure Hp is the zero-order hold with 
sampling time h, which outputs a constant value ug[k] = u(kh) over [kh, (k + 
1)h), k = 0,1,2,... (see Figure 9.1). Also, Sp, is the ideal sampler that outputs 
the sampled value ag[k] = a(kh), k = 0,1, 2, .. . of the continuous-time signal 
a(t). 

As discussed above, the discrete-time system from ug[k] to æq[k] in Figure 9.2 
is a linear time-invariant discrete-time system as in (9.11). Then, under this dis- 
cretization, the stability is preserved; if A is stable, that is, if the eigenvalues of A 
have non-positive real parts then Aq is Schur stable, that is, the eigenvalues of Aq 
lie in the closed unit circle in C. This is easily shown from the spectral mapping 
theorem: the set of the eigenvalues of Ag = e^! is given by {eA1h eal} where 
A; is the i-th eigenvalue of A. 

On the other hand, we cannot say the controllability is not always preserved 
under the zero-order-hold discretization. To discuss this, we introduce the concept 
of pathological sampling. 


Definition 9.2 (pathological sampling). Let (A) be the set of eigenvalues of A. 
The sampling time h > 0 is said to be pathological if there exist 41, 42 € A(A) such 
that 


1. Ay # Ad 
2. Redy = Re 22 
3. there exists k € {41, +2, ...} such that 


Bik 
int ine = —. (9.16) 


Intuitively, pathological sampling synchronizes an oscillation mode in the plant. 
The following illustrates pathological sampling. 


Figure 9.2. Zero-order-hold discretization: continuous-time system a(t) = Ax(t) + bu(t) is 
discretized by zero-order hold Hn and ideal sampler S, with sampling time h. 
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Example 9.3. Let us consider a linear system 
y¥)=-y®, y@)=0, yO)=1. (9.17) 
Then the solution of this differential equation is given by 
y(t) = sint. (9.18) 
If we sample this output with sampling period h = n, then we have 
y(kh) =sinkh =0, k=0,1,2,... (9.19) 


This is an example of pathological sampling. The state-space representation of (9.17) 


is given by 
d | x(t) _{ 0 1 x1(t) x1(0) _ fo 


where xı (t) = y(t) and x2(t) = y(t). Then the matrix 


0 1 
A= 2 | (9.21) 
has two eigenvalues 14 = +j satisfying 
2k 
inir tad =e i (9.22) 


with k = 1. Therefore, h = m is certainly pathological. 


When the sampling is non-pathological, then the controllability is preserved as 
shown in the following theorem. 


Theorem 9.4. Assume that the sampling time h is non-pathological. Then (A, b) is 
controllable if and only if (Aq, ba) is controllable. 


The proof is found in [22]. 


9.3 Reduction to Finite-dimensional Optimization 


Now we reduce the L}-optimal control problem (L!-OPT) (p. 150) into a finite- 
dimensional £! optimization problem by the time discretization. 

First, the constraint on the magnitude of control ||u||oo < 1 is equivalently 
written by 


lugik]| < 1, Vk € {0,1,2,...,n— 1}, (9.23) 


170 Numerical Optimization by Time Discretization 


under the zero-order-hold assumption (9.3). Let us denote by ||2#|| geo the °° norm 
of a vector u (see (2.29) in Chapter 2). Then the above inequality is equivalent to 


llull < 1. (9.24) 


Next, under the zero-order-hold assumption, the L! cost function becomes 


T 
n= f uod 


n—l1 


(k+1)h 
=Z | woar 
kh 


k=0 


n=l p(k+1)h 
9.25 
=> f lu glk] |dt Oe) 
k=0 kh 


n—-1 
= Do lualklļh 
k=0 


=hllulle. 


Now the L!-optimal control problem (L! OPT) is reduced to the following 
finite-dimensional £! optimization problem: 


minimize lulla subjectto Du = ¢, ull < 1. (9.26) 
ueR” 


This optimization problem is a convex optimization since the cost function (£! 
norm) is a convex function, and the constraint set 


C£{ueR": ®u=, llull < 1} (9.27) 


is a convex set in R”. We can easily solve this problem by using CVX in MAT- 
LAB (see Section 3.3 in Chapter 3, p. 51). A MATLAB program to solve the £! 
optimization (9.26) using CVX is given in Section 9.5. 


9.4 Fast Algorithm by ADMM 


If the order d of the system and the number n for discretization are not so large, 
you can obtain a solution easily by CVX. However, if you want to use the control in 
a feedback loop, then you must solve the problem in real time. Also, in real systems, 
the control algorithm should be implemented in a microcomputer, which often has 
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just a cheap computational ability and is hard to run CVX. In such a case, we need 


to implement a fast and simple algorithm for the specific £! optimization problem 


(9.26). For this purpose, we can use the efficient algorithms studied in Chapter 4. 
In particular, we here use ADMM (Alternating Direction Method of Multipliers) 


studied in Section 4.5 to solve (9.26). 
First, define the unit ball C1 C R” with the £% norm by 


Cı = {we R” : |lulleo < 1}. 
Also, let Cz be a singleton of ¢ € R4, that is, 
C2 Ê (Ch. 


Define the indicator functions of the sets C and C2, respectively, by 


0, if lulles < 1, 
le, (u) = ; 
œ, if llull% > 1, 
0, ifx=Q, 
Ic, (x) = : 
co, ifaw. 


Then the optimization problem (9.26) is equivalently described by 
minimize {|lell¢ + Ic, (u) + Ic, (®u)}. 
Next, define new variables zo, z1 € R”, z2 € Rf by 
zo = z1 =u, z2 = u. 
Then the problem (9.32) becomes 


minimize {llzolle + Ic, (21) + Ic, (22)} subject to z = Yu, 
ueR",zeR” 


where v £ 2n + d, and 


Zo I 
z2|/z,|eR’, YA/7] eR". 
22 O 


Defining two functions fı and fz by 


fiu) £0, falz) = lizola + Ic, (21) + Ie (22) 


(9.28) 


(9.29) 


(9.30) 


(9.31) 


(9.32) 


(9.33) 


(9.34) 


(9.35) 


(9.36) 
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we finally obtain the standard optimization problem for ADMM (see (4.96), p. 78): 


minimize filu) + fo(z) subject to z = Pu, (9.37) 


ueR",z 


for which ADMM algorithm is given by (see Section 4.5.1, p. 78) 


ulk + 1] := arg min [j E -Yu — z[k] + vik] Ag (9.38) 
wel” 

z[k + 1] := prox, »,(Pulk + 1] + v[k]), (9.39) 

vik + 1] := v[k] + Bulk + 1] — z[k + 1]. (9.40) 


Let us compute the functions in (9.38)—(9.40). First, since f = 0, the first step 
(9.38) is minimization of a quadratic function, and it is reduced to the following 
linear equation: 


u[k + 1] = arg min {= —|Pvu — z[k] + v[k] le} 


ueR" (9.41) 
= (Pw)! pl (z[k] — v[k]). 
Note that YTY = 27 + O! Ô is non-singular and the matrix 
M £ (ely YT (9.42) 


can be computed off-line (i.e. outside the iteration). 

The size of YTY is n x n, and if the number n of time discretization is very 
large, then the computation of the inversion may take large computational time. 
In this case, we can adopt the matrix inversion lemma 


(X +UYVY! = X7! — XIU (YT! + VXU) IV XT}. (9.43) 
By this, the inverse matrix (Y')—! can be rewritten as 
1 1 
(PTP! = (21 + lo)! = ~ z2 (2 +00!) lo. (9.44) 


This requires inversion of matrix 27 + PO! of sized x d, and ifd <n then the 
computational time can be significantly reduced. 
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Figure 9.3. Saturation function sat(u) = sgn(u) min{|u|, 1}. 


The second step (9.39) in ADMM algorithm can be split into three simple opti- 
mization problems with variables zo, z1, and z2 defined in (9.35). For the variable 
zo, we use the proximal operator of the £!, which is the sofi-thresholding opera- 
tor defined in (4.46) (see also Figure 4.8 on p .67), that is, the i-th element of 
PTOX, Jully (u) is given by 


Ui— y, uiy, 
[prox, pa W]; = S; W; = 7 0, luil < y, (9.45) 
uit y, Uuis-y, 


where u; is the i-th element of vector u. 

For the variables zı and z2, we need to compute the proximal operators of indi- 
cator functions. From (4.38) (p. 65), the proximal operator of the indicator func- 
tion on a closed and convex set C is given by the projection IIe onto C. Therefore, 
the second step for variables zı and z2 are reduced to IIc, and He. 

The projection Iç, is given by 


sat(u1) 


sat(u2) 


Ic (u) = ,  sat(u) Ê sgn(u) min{ļul, 1}, (9.46) 


ate 
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where the function sat(-) is called the saturation function. Figure 9.3 shows the graph 
of the saturation function. The other projection Ie, = Hic) is simply given by 


Ie, (z) =€, (9.47) 
In summary, the second step for variable z is given by 


S, (ulk + 1] + vo[k]) 
z[k +1] = | We, (ufk + 1] + vi[k]) |, (9.48) 
Ç 


where we split the vector v[k] as v = [vd, vl, vi]! consistent with the split of 
z in (9.35). 
Now we obtain ADMM algorithm to solve the £ i optimization (9.26): 


ADMM algorithm to solve the £! optimization problem (9.26) 


Initialization: give initial vectors z[0], v[0] € R”, and real number y > 0 
Iteration: fork = 0,1,2,...do 


ul[k + 1] = M(z[k] — v[k]) 
S, (u[k + 1] + volk]) 

z[k + 1] = | IIc, (ulk + 1] + vi [k]) (9.50) 
¢ 

vik + 1] = v[k] + Yufk +1] — z[k +1], k=0,1,2,... 


(9.49) 


(9.51) 


In this algorithm, the matrix M in (9.49) is given by (9.42). 
As mentioned in [12], ADMM algorithm is very fast and needs just some dozens 


of iterations to obtain a solution with a sufficient precision. This property is very 
important if you adapt the finite-horizon L! optimal control to the model predictive 
control [70], where real-time computation is essential. 
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9.5 MATLAB Programs 


We show MATLAB programs to solve the £! optimization problem (9.26). One is 
a program using CVX. The other is an implementation of ADMM algorithm. 


MATLAB program to solve £! optimization problem (9.26) via CVX 


clear 
%% System model 
% Plant matrices 
A = [0,1;0,0]; 
b = [0:1]; 
d = length(b); %system size 
% initial states 
xO = [1⁄1]; 
% Horizon length 
T=5; 
%% Time discretization 
% Discretization size 
n = 10000; % grid size 
h = T/n; % discretization interval 
% System discretization 
[Ad,bd] = c2d(A,b,h); 
% Matrix Phi 
Phi = zeros(d,n); 
v = bd; 
Phi(:,end) = v; 
for j = 1:n-1 
v = Ad*v; 
PhiC:,end-j) = v; 
end 
% Vector zeta 
zeta = -Ad*n*xO; 
%% Convex optimization via CVX 
cvx_begin 
variable u(n) 
minimize norm(u,1) 
subject to 
Phi*u == zeta; 
norm(u,inf) <= 1; 
cvx_end 
%% Plot 
figure; 
plot(O:T/n:T-T/n,u); 
title? Sparse control’); 
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MATLAB program to solve (9.26) with ADMM 


clear 
%% System model 
% Plant matrices 
A = [0,1;0,0]; 
b = [0;1]; 
d = length(b); %system size 
% initial states 
xO = [1:1]; 
% Horizon length 
T=5; 
%% Time discretization 
% Discretization size 
n = 1000; % grid size 
h = T/n; % discretization interval 
% System discretization 
[Ad,bd] = c2d(A,b,h); 
% Matrix Phi 
Phi = zeros(d,n); 
v = bd; 
Phi(:,end) = v; 
for j = 1:n-1 
v = Ad*v; 
PhiC,end-j) = v; 
end 
% Vector zeta 
zeta = -Ad*n*xO; 
%% Convex optimization via ADMM 
mu = 2*n+d; 
Psi = [eye(n);eye(n);Phi]; 
M = (0.5*eye(n) - 0.5*Phi’*inv(2*eye(d)+Phi*Phi’)*Phi)*Psi’; 
sat = @(x) sign(x).*min(abs(x),1); 
EPS = le-5; 
MAX_ITER = 100000; 
z = [zeros(2*n,1);zeta]; v = zeros(mu,]); 
r = zeta; 
k= 0; 
gamma = 0.05; 
while (norm(r)>EPS) & (k < MAX_ITER) 
u = M*(z-v); 
ZO = soft_thresholding(gamma,u+tv(1:n)); 
zl = sat(u+v(n+1:2*n)); 
z2 = zeta; 
z = [zO:zi:z2]; 
v = v + Psi*řu - Z; 
r = Phi*u - zeta; 
k=k+1; 
end 
%% Plot 
figure; 
plot(O:T/n:T-T/n,u,’LineWidth’,2); 
title’ Sparse control’); 
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9.6 Further Reading 


The time discretization discussed in this section is based on the fundamental theory 
of sampled-data control, for which you can refer to a standard textbook by Chen 
and Francis [22]. The concept of pathological sampling is also found in this book. 

Instead of the time discretization method, one can also use the shooting method 
for numerical computation of L!-optimal control. The shooting method is based 
on the necessary conditions by Pontryagin’s minimum principle. For the shooting 
method, see [10] for details. 
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Chapter 10 


Advanced Topics 


In this chapter, we introduce advanced topics in maximum hands-off control. 


10.1 Smooth Hands-off Control by Mixed L!/L? 
Optimization 


As we studied in Chapter 8, the maximum hands-off control (the L°-optimal con- 
trol) is bang-off-bang (Theorem 8.14, p. 157), that is, it is a piecewise constant 
function taking values of +1 or 0. This means that the maximum hands-off con- 
trol is discontinuous; the control changes its value between 1 and 0, or 0 and —1 at 
switching times. This is undesirable for some applications in which the actuators 
cannot move abruptly. In this case, one may want to make the control continuous. 
For this purpose, we add a regularization term to the L! cost Jı (u) in the L! opti- 
mal control problem (L!-OPT) (p. 150). That is, we consider the following cost 
function: 


1 r 1 
Jiz (u) = Allullı + zll = [ (am + sun?) dr, (10.1) 


where A > 0 is a fixed parameter. 
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The idea to add the L?-norm term is borrowed by the elastic net regularization’ in 
compressed sensing [119]. The elastic net regularization promotes sparsity with the 
grouping effect, where strongly correlated vectors are chosen at the same time. This 
ensures that the solution is not overly sensitive to small changes in the observation. 
From this idea, the L? term in (10.1) enhances continuity of the solution. 

With the cost function (10.1), we consider the following mixed L'/L?-optimal 
control problem. 


L! /L?-optimal control problem (L!/L?-OPT) 


For the linear time-invariant system 


“(t)=Aa(t)+bu(t), t>0, «(0)=€eRY%, 
find a control {u(t) : t € [0, T]} with T > O that minimizes 
1 
J(u) = Aljulla + 5 lel 


subject to 


a(T) = 0, 


lullo < 1. 


To discuss properties of the L! /L?-optimal control, we give necessary conditions 


of optimality by Pontryagin’s minimum principle. 
The Hamiltonian function associated to (L! y L?-OPT) is given by 


1 
Hæ, p,n) = p7 (A + bu) + n (Alul + Zlu). (10.2) 


We do not consider the abnormal case (i.e., 7 = 0) and assume y = 1. Let u* (t) 
denote the optimal control and x*(t) and p* (t) the resultant optimal state and 
costate, respectively. Then we have the following result. 


Lemma 10.1. The L!/L?-optimal control u* (t) satisfies 


u* (t) = -sat(; (p" ()"b)) f (10.3) 


1. The name “elastic net” is meant to suggest a stretchable fishing net that retains all the big fish. 
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Figure 10.1. Saturated shrinkage function sat(5;(v)). 


where S, (-) is the soft-thresholding operator (see Section 4.2.5, p. 66) defined by 


v+A  ifo <—-A, 
Si(v) = 40, if -A<o<A, (10.4) 
v— À, ifA<ov, 


and sat(-) is the saturation function defined by 


-l, ifv <—-l, 
sat(v) = 49, if —1<ov <1, (10.5) 
1, ifl <v. 


See Figure 10.1 for the graphs of sat(Sj(v)) in (10.3). 


Proof: From Pontryagin’s minimum principle, we have 


1 
u* (t) = arg min [oou + Alu] + sur} 
ue[—1,1] 2 


1, if p*(t)'b< —A-1 
—(p*(t)'b+A), if -A-1 <p*()'b< -A 
=a f -isp bei (10.6) 
—(p*(t)'b-A), if4<pP@t)'b<A+1 
-—]1, if A+1< p*(t)'b 


= ~sat(S; (p"()")) 


From Lemma 10.1, we have the following theorem. 


Theorem 10.2 (Continuity). The L!/L?-optimal control u* (t) is continuous in t 
over [0, T]. 
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Proof: Define 
īp) ê ~sat(S; (p"b)) (10.7) 


Since the composite function sat o Sj is continuous (see Figure 10.1), (p) is also 
continuous in p. It follows from Lemma 10.1 that the optimal control u* given in 
(10.3) is continuous in p*. Hence, u* (t) is continuous, if p* (t) is continuous in t 
over [0, T]. In fact, from (8.20) (p. 153), p* (t)' bis given by 


p*(t)'b= p* (0)' e7“ b, (10.8) 


which is continuous in t over R. 


Theorem 10.2 motivates us to use the L!/L? optimization in the L!/L?- 
optimal control problem (L!/L?-OPT) for continuous hands-off control. 

In general, the degree of continuity (or smoothness) and the sparsity of the con- 
trol input cannot be optimized at the same time. The weight parameter A can be 
used for trading smoothness for sparsity. Lemma 10.1 suggests that increasing the 
weight A makes the L'/L? optimal control u*(t) sparser (see also Figure 10.1). 
On the other hand, decreasing 2 smoothens u* (t). 


Example 10.3. Let us consider the following linear system 


0 1 0 0 0 

da(t) 001 0 0 

a Coti a(t) + 0 u(t). (10.9) 
00 0 0 1 


We set the final time T = 10, and the initial and final states as 
x(0) = [0.5,0.5,0.5,0.5]', æ(10) = 0. (10.10) 


Figure 10.2 shows the L'/L* optimal control with weights à = 1. The maximum 
hands-off control is also illustrated. We can see that the L'/L?-optimal control is con- 
tinuous but sufficiently sparse. 


10.2 Discrete-valued Control 


As we observed in Chapter 8, the maximum hands-off control (or the L°-optimal 
control) takes values in an alphabet {—1, 0, 1}. Such a control is called a discrete- 
valued control, since the control takes a finite number of values. Discrete-valued 


2. The word alphabet is borrowed from information theory [27]. An alphabet is a set of a finite number of 
elements that are used to represent signals of interest. 
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Figure 10.2. Maximum hands-off control (dashed) and L!/L?-optimal control (solid). 
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Figure 10.3. An example of discrete-valued control that takes three values of Uj, U2, 
and U3. 


control is important in networked control systems where the bandwidth of the 
network is limited, since discrete-valued signals can be effectively compressed. 

We here generalize the property of discreteness in maximum hands-off control 
by the sum-of-absolute-values (SOAV) optimization. 


10.21 Sum-of-Absolute-Values (SOAV) Optimization 
Let us consider discrete-valued control for the linear time-invariant plant 
a(t) = Ax(t)+bu(t), t>0, 2(0)=€eR%, (10.11) 
where the control u(t) takes N real numbers 
Ui < U2 <... < UN. (10.12) 


That is, we consider a discrete-valued control with alphabet {U1, U2,..., Un}. 
Figure 10.3 shows an example of discrete-valued control. We then seek a discrete- 
valued control that achieves æ(T) = 0, given the initial state (0) = € and the 
control time T > 0. 
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A standard method to obtain discrete-valued control is to describe the problem 
as a mixed-integer programming problem [6]. However, this method requires a lot 
of computational time, which glows exponentially as the size of problem glows, and 
hence this method is hard to apply for a large scale problem. Instead, we consider 
convex relaxation of this optimization problem of discrete-valued control. 

We first define the feasible controls that drive the state x(t) from the initial state 
x(0) = £ to the origin in time T > 0, satisfying 


Uı < u(t) < Uy, Vt €([0,T). (10.13) 
We denote by U (T, £) the set of feasible controls. We assume that £ € R? and T > 


0 are given such that U/(T, £) is non-empty. For a feasible control u € U(T, §), 
define the following cost function: 


N 
Jou) = $ wjllu — Ujllo. (10.14) 
j=l 
where w1, W2,..., Wy are weights that satisfy 
wi>0, witwot:::-+wy = 1. (10.15) 


Minimizing the cost function (10.14) may promote discreteness of the control 
to take values in {U1, ..., Ux}. This can be explained as follows. A discrete-valued 
control is a piecewise constant signal as shown in Figure 10.3. Ifu(t) = U; for t in 
some time intervals with a positive length, then the function u(t) — U; is zero over 
the intervals, and hence it is sparse. Namely, the L? norm of the function u — Uj 
should be smaller than T. If we choose the weights w1, ..., wy according to the 
importance of the values U4, ..., Uy and minimize the cost function (10.14), we 
may obtain a discrete-valued feasible control. 

The cost function (10.14) is discontinuous and non-convex, and hence it is dif- 
ficult to directly obtain the optimal solution as in the case of L°-optimal control. 
We then adopt the L! relaxation, that is, we use the L! norm instead of the L° 
norm in (10.14): 


N TN 
noX wjlu-un= f Š wjlult) — Ujldt (10.16) 
j=l % jai 


We call this cost function the sum of absolute values or SOAV for short, and the 
optimal control that minimizes the SOAV cost function the sum-of-absolute-values 
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optimal control or SOAV-optimal control. We describe the SOAV-optimal control 
problem as follows: 


SOAV-optimal control problem (SOAV-OPT) 


For the linear time-invariant system 


x(t) = Ax(t) + bult), t>0, 2(0)=€eR’, 


find a control {u (t) : t € [0, T]} that minimizes 


N T N 
Aw) = Yo) —Ujhi = f SS wjlu® — Ujldt 
j=l jal 


subject to 


x(T)= 0, 


Uı su(t)< Un, Vte [0,7]. 


10.2.2 Discreteness of SOAV-optimal Control 


Here we show that the SOAV-optimal control is a discrete-valued control taking 
values in {U),..., Uy} under some conditions. 

Let u* € U(T, €) be an SOAV-optimal control minimizing the cost function 
(10.16), that is, 


u* =argmin Jı (u) subject to u € U(T, &). (10.17) 


For the optimal control problem (SOAV-OPT), we analyze the solution u* by using 
Pontryagin’s minimum principle. 
The stage-cost function L(u) of the SOAV cost function (10.16) is given by 


N 
L(u) = >) wjlu — Uj]. (10.18) 
j=l 


Figure 10.4 shows an example of function L(u). As shown in this figure, the 
stage-cost function L(u) is a continuous and piecewise linear function. Also, 
since the function L(u) is a convex combination of convex functions |u — Uj|, 
j =1,...,N, L(u) is convex in u. That is, the optimization problem in (SOAV- 
OPT) is a convex optimization problem. 
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uU 


Figure 10.4. Piecewise linear function L(u). 


Then the Hamiltonian for (SOAV-OPT) is defined by 


H"(x,p,u) = p' (Ax + bu) + nL (u) 
N 
=p" (Ax + bu) +9 X wjlu— Uj. 
j=1 


(10.19) 


Here we assume?’ 7 = 1. Let x* and p* be respectively the optimal state and costate 
with the optimal control u*. From the minimum principle, we have 


u*(t)= argmin {p*(t)'(Ax*(t) + bu) + L(u)} 


ue[U},UN] 
(10.20) 
= argmin THOM + L(u)}. 
ue[U1, Un] 
Let us solve the minimization problem in (10.20). 
Since the function L (u) is piecewise linear, L (u) can be written as 
au + bi, u € [U1, U2], 
au + bo, u € [U2, U3], 
Luj= (10.21) 
ay-1u+by-1, u €[Un-1, Un], 
where 
k N 
naD- > oy 
j=1 =k+1 
ý a (10.22) 
k N 
b=- wjUj+ >) wjUj, k=1,2,...,N—-1 
j=l j=k+1 


3. Ify = 0, then the extremum solution is a bang-bang control that takes values of Uj or Uy, which is a 
discrete-valued control. 
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Fix t € [0, T] and define a £ p*(t)'b e R. Since L (u) is continuous, and the 
following inequality 


ay <a? <: <ayn-] (10.23) 


holds, we can compute the minimizer of 


(aj +a)u + bı, u € [U1, U2], 
wincio Ae MEERE i 
(an-ı +a)u+by-1, u €[Un-1, Un], 
for u € [U1, Uyn]. 
(i) Ifa; +a > 0, then from (10.23) we have 
0<ajta <a +a <::: <ay_) +4, (10.25) 


and the slopes (ax + a) of the linear functions in (10.24) are all positive. 
See (i) of Figure 10.5. Hence we have 


argmin h(u) = Uj. (10.26) 
ue[U,,UN] 


th(u) 


U Us Us U4 Ui Us Us ij 
(ii) (iv) 


Figure 10.5. 4 cases of piecewise linear function h(u) = au + L(u). 
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(ii) 


(iti) 


(iv) 


Ifa, +a < 0andag+ı +a >0(k =1,...,N — 2), then from (10.23) 


we have 
aj+a <ar+a <.: <ag+a <0, (10.27) 
and 
O < ak4ı +a < ak}2 ta <-++ <ay-1 +a. (10.28) 


The sign of the slopes of the linear functions in (10.24) changes from neg- 
ative to positive at u = Ugą+1. See (ii) of Figure 10.5. Hence, we have 


argmin h(u) = Uk+1. (10.29) 
ue[U),UN] 


Ifay—1 +a < 0, then we have 
ata<aota <::-<ay_-j+a <Q, (10.30) 


and the slopes (ax + a) in (10.24) are all negative. See (iii) of Figure 10.5. 
Hence we have 


argmin h(u) = Uy. (10.31) 
ue[U,,UNn] 


If there exists k € {1,2,..., N — 1} such that ag + a = 0, then the slope 
becomes zero over the interval [Uz, Ux41]. Hence we have 


argmin h(u) = [Ux, Uk+1]. (10.32) 
ue[U1,UN] 


In this case, we cannot determine the unique value for u* (t). 


In summary, the SOAV optimal control u* (t) satisfies the following: 


Ui, if —a, < p*(t)'b 
Ud, if — a < p*(t)'b < —a] 


u* (t) = ; (10.33) 


Uy-1, if -an-ı < p*(t)'b < —ay-2 
Uy, if p*(t)'b < —ay-1 


u*(t) € [Uk, Uri], ifp*(t)'b= —ay, k=1,2,...,N—1 (10.34) 


From (10.34), if 


p*(t)'b#—azy, k=1,2,...,N—1 (10.35) 
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holds for almost all t € [0, T], then we can see that u*(t) takes discrete-values in 
{U,,..., Un} for almost all t € [0, T]. Let us consider a sufficient condition for 
this. 

We see that (10.35) holds for almost all £ € [0, T] if and only if 


u({t € [0, T]: p*()"b = —ax}) = 0 (10.36) 


for k = 1,2,..., N — 1. We say the SOAV optimal control is non-singular if 
(10.36) holds. Then we have the following theorem: 


Theorem 10.4. Assume that the SOAV optimal control is non-singular. Then the opti- 
mal control u* (t) takes values in {U,,..., Un} for almost allt € [0, T]. 


For the non-singularity, we have the following theorem. 
Theorem 10.5. Assume that the pair (A, b) is non-singular“ Assume also that 
k N 
S uy > Wj (10.37) 
j=1 j=k+1 
holds fork = 1,2, ..., N — 1. Then the SOAV optimal control is non-singular. 
Exercise 10.6. Prove Theorem 10.5. 


The condition (10.37) in Theorem 10.5 is a sufficient and necessary condition 
for the slopes of the linear functions in (10.24) to be nonzero. 


Example 10.7. Let us consider a design example of SOAV-optimal control. We consider 
the 4-th order plant given in (10.9) in Example 10.3. The final time T = 10 and the 
initial and final states are the same as (10.10). 

The alphabet is given by {—1, —0.5, 0, 0.5, 1}, that is N = 5 and 


Ui = —1, Uy = —0.5, U3 = 0, U4 = 0.5, U5 = 1. (10.38) 


The weights in the cost function (10.16) are set as 


1 
W1 = W2 = W3 = W4 = W5 = z (10.39) 


Figure 10.6 shows the obtained SOAV-optimal control. In this figure, the max- 
imum hands-off control (L!-optimal control) discussed in Chapter 8 and the 


4. The pair (A, b) is said to be non-singular if the pair (A, b) is controllable and A is non-singular. 
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Figure 10.6. SOAV-optimal control (solid), L!-optimal control (dashed), and L?-optimal 
control (dotted). 


L?-optimal control’ that minimizes the L? norm 


T 
Jo(u) = f |u(t)|-dt (10.40) 
0 


are also shown. Note that the L!-optimal control is bang-off-bang and takes values 
of +1 and 0. On the other hand, the L?-optimal control is a smooth control. The 
SOAV-optimal control is between them. It takes discrete values in the alphabet 
{—1, —0.5, 0, 0.5, 1}, that is a quantization of the L?-optimal control. 

Figure 10.7 shows the state variables x(t), ..., x4(¢) in the state a(t) and the 
SOAV-optimal control. We can see that by the obtained discrete-valued control 
u(t), all the state variables converge to the origin in the time T = 10. Note that this 
cannot be possible when one uses a quantized version of the L?-optimal control by a 
static quantizer; there should be quantization errors that perturb the state trajectory. 


10.3 Time-Optimal Hands-off Control 


In this section, we consider an optimal control that takes account of sparsity 
and time-optimality at the same time. Let us consider the following linear time- 
invariant system: 


z(t) = Axt) + bult), t>0, 2(0)=€eR?. (10.41) 


5. The L?-optimal control is also known as minimum energy control [2, Section 6-18]. 
P gy 
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state variables x,(t) and control u(t) 


0 2 4 6 8 10 
time (sec) 


Figure 10.7. State variables x;(t),...,. x4(t) and the SOAV-optimal control u(t), t € [0, 10]. 


The control objective is to drive the state to the origin. Here we do not fix the final 
time T. As in the minimum-time control in Chapter 7, the final time T is also an 
optimization variable. 

First, we consider the feasibility of the control. For the system (10.41), a control 
u is said to be feasible if there exists a finite time T > O such that by {u(t) : t € 
[0, T]} satisfying 


lu(t)| <1, Vt e [0, T], (10.42) 


the state a(t) in (10.41) is steered from æ(0) = € to (7) = 0. From the defi- 
nition of the controllable set R in (7.30) (p. 134), there exists a feasible control if 
the initial state € is in the controllable set R. Therefore, we assume € € R. Using 
the feasible set U (T, £) with fixed T > 0 (see Section 7.1.3), the set of all feasible 


controls is given by 


UE) = [J UCT, £). (10.43) 


T>0 


Next, we formulate the optimal control problem. We seek a feasible control u € 
U(E) that minimizes the L? norm of u and the response time T at the same time. 
For this, we consider the following cost function: 


Jou) = Aljullo +T, (10.44) 


where A > 0 is a weight parameter for a trade-off between the two requirements. As 
usual, we relax the L° norm in (10.44) by the L! norm |||], namely, we consider 
the following cost function: 


J(u) = Aull: +T, (10.45) 
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Now we formulate our problem. 


L!-time-optimal control problem (L!-T-OPT) 


For the linear time-invariant system 


x(t) = Ax(t)+bu(t), t>0, 2(0)=éeR%, 


find a control {u(t) : t € [0, co)} that minimizes 


Jiu) = Alul +T 


subject to 


x(T)=0, 


lullo < 1. 


We call the optimal solution the L!-time-optimal control. 
The existence theorem for the L!-time-optimal control is proved similarly to the 


time-optimal control (Theorem 8.15, p. 158). 


Theorem 10.8. For any initial state E € R, there exists at least one L'-time-optimal 
control. 


You can find the proof in [57]. 
The Hamiltonian for (L!-T-OPT) is given by 


H"(a, p,u) = p' (Ax + bu) + n(Alu| + 1) (10.46) 
We do not consider the abnormal case (y = 0) and assume y = 1. Then the 


optimal control u* (t) of (L!-T-OPT) satisfies 


u*(t) = arg min H! (æ, p, u) = arg min [PO bu + Alul} (10.47) 
ue[—1,1] ue[—1,1] 
From this, we have 

l, if p*(t)'b < —1, 

u*(t)= 40, if —4<p*(t)'b <å, 
-l, ifå< p* (t)! b, (10.48) 

u*(t) € [0,1], if p*() "b= —4, 

u*(t) € [-1,0], if p*@)'b=A. 
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If p*(t)'b = +2 holds only on sets of measure zero (i.e., if p* (t)'b # +A 
almost all t € [0,7], then the L!-time-optimal control is bang-off-bang. From 
Lemma 8.3, we have the following theorem: 


Theorem 10.9. Assume that the pair (A, b) is non-singular. Then the L'-time- 
optimal control is bang-off-bang (if it exists). 


This theorem along with Theorem 10.8 leads to the following equivalence theorem: 


Theorem 10.10. Assume E € R and the pair (A, b) is non-singular. Then the L'- 
time-optimal control is equivalent to the L°-time-optimal control that minimizes the 
cost function (10.44). 


10.4 Further Reading 


The smooth hands-off control by the mixed L!/L? optimization was first pro- 
posed in [83]. Another formulation for smooth hands-off control by the CLOT 
(Combined L-One and Two) norm was also proposed in [85]. The CLOT norm is 
defined by 


lullcror = 21 lulli + Azilulle, (10.49) 


with parameters A} > 0 and Az > O such that 4; + Az = 1. Compared with 
the mixed L!/L? cost function in (10.1), the L? term in the CLOT norm is not 
squared. The CLOT-optimal control is also continuous but sparser than the mixed 
L'/L?-optimal control in (L'/L?-OPT). 

The SOAV optimal control has been proposed in [53, 56]. The idea of 
SOAV cost function was first proposed in [77] for discrete-valued signal recon- 
struction. The SOAV optimization was then applied to digital communications 
[46, 101, 102]. 

The L!-time-optimal control was first proposed in [57]. 
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